Results 1 - 19 of 19
1.
BMC Neurol ; 23(1): 2, 2023 Jan 04.
Article in English | MEDLINE | ID: mdl-36597038

ABSTRACT

BACKGROUND: Although of high individual and socioeconomic relevance, a reliable prediction model for the prognosis of juvenile stroke (18-55 years) is missing. Therefore, the study presented in this protocol aims to prospectively validate the discriminatory power of a prediction score for the 3-month functional outcome after juvenile stroke or transient ischemic attack (TIA) that has been derived from an independent retrospective study using standard clinical workup data. METHODS: PREDICT-Juvenile-Stroke is a multi-centre (n = 4) prospective observational cohort study collecting standard clinical workup data and data on treatment success at 3 months after acute ischemic stroke or TIA that aims to validate a new prediction score for juvenile stroke. The prediction score has been developed from a single-center retrospective analysis of 340 juvenile stroke patients. The score determines the patient's individual probability of treatment success, defined by a modified Rankin Scale (mRS) score of 0-2 or return to the pre-stroke baseline mRS 3 months after stroke or TIA. This probability will be compared to the observed clinical outcome at 3 months using the area under the receiver operating characteristic curve. The primary objective is to validate the clinical potential of the new prediction score for a favourable outcome 3 months after juvenile stroke or TIA. Secondary objectives are to determine to what extent predictive factors in juvenile stroke or TIA patients differ from those in older patients, and to determine the predictive accuracy of the juvenile stroke prediction score for other clinical and paraclinical endpoints. A minimum of 430 juvenile patients (< 55 years) with acute ischemic stroke or TIA, and the same number of older patients, will be enrolled for the prospective validation study.
DISCUSSION: The juvenile stroke prediction score has the potential to enable personalised counselling, provision of appropriate prognostic information, and identification of patients who benefit from specific treatments. TRIAL REGISTRATION: The study was registered at https://drks.de on March 31, 2022 (DRKS00024407).
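The planned validation step, comparing each patient's predicted probability of treatment success against the observed 3-month outcome via the area under the receiver operating characteristic curve, can be sketched as follows. This is a minimal illustration with made-up data; the actual MS-unrelated PREDICT score and its inputs are not given in the abstract.

```python
# Sketch: validating a prognostic score against observed outcomes using
# the area under the ROC curve (AUC). All numbers below are illustrative.

def auc(scores, labels):
    """AUC via the Mann-Whitney interpretation: the probability that a
    randomly chosen favourable-outcome patient (label 1) received a higher
    predicted probability than a randomly chosen unfavourable one (label 0).
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted probabilities and observed outcomes
# (1 = favourable, e.g. mRS 0-2 at 3 months).
predicted = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2]
observed  = [1,   1,   0,   1,   0,   0,   0]
print(round(auc(predicted, observed), 3))  # → 0.917
```

An AUC of 0.5 would indicate no discriminatory power; values approaching 1.0 indicate that the score separates favourable from unfavourable outcomes well.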


Subject(s)
Ischemic Attack, Transient , Ischemic Stroke , Stroke , Humans , Young Adult , Aged , Ischemic Attack, Transient/diagnosis , Ischemic Attack, Transient/epidemiology , Ischemic Attack, Transient/complications , Ischemic Stroke/complications , Retrospective Studies , Stroke/diagnosis , Stroke/epidemiology , Stroke/complications , Prognosis , Predictive Value of Tests , Observational Studies as Topic
2.
Emerg Infect Dis ; 28(3): 572-581, 2022 03.
Article in English | MEDLINE | ID: mdl-35195515

ABSTRACT

Hospital staff are at high risk for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection during the coronavirus disease (COVID-19) pandemic. This cross-sectional study aimed to determine the prevalence of SARS-CoV-2 infection in hospital staff at the University Hospital rechts der Isar in Munich, Germany, and identify modulating factors. Overall seroprevalence of SARS-CoV-2-IgG in 4,554 participants was 2.4%. Staff engaged in direct patient care, including those working in COVID-19 units, had a similar probability of being seropositive as non-patient-facing staff. Increased probability of infection was observed in staff reporting interactions with SARS-CoV-2-infected coworkers or private contacts, or exposure to COVID-19 patients without appropriate personal protective equipment. Analysis of spatiotemporal trajectories showed that distinct hotspots for SARS-CoV-2-positive staff and patients only partially overlap. Patient-facing work in a healthcare facility during the SARS-CoV-2 pandemic might be safe as long as adequate personal protective equipment is used and infection prevention practices are followed inside and outside the hospital.


Subject(s)
COVID-19 , SARS-CoV-2 , Cross-Sectional Studies , Germany/epidemiology , Health Personnel , Hospitals, University , Humans , Immunoglobulin G , Infection Control , Personnel, Hospital , Prevalence , Seroepidemiologic Studies
3.
Eur J Public Health ; 32(3): 422-428, 2022 06 01.
Article in English | MEDLINE | ID: mdl-35165720

ABSTRACT

BACKGROUND: Heterozygous familial hypercholesterolemia (FH) represents the most frequent monogenic disorder, with an estimated prevalence of 1:250 in the general population. Diagnosis during childhood enables early initiation of preventive measures, reducing the risk of severe consecutive atherosclerotic manifestations. Nevertheless, population-based screening programs for FH are scarce. METHODS: In the VRONI study, children aged 5-14 years in Bavaria are invited to participate in an FH screening program during regular pediatric visits. The screening is based on low-density lipoprotein cholesterol measurements from capillary blood. If the value exceeds 130 mg/dl (3.34 mmol/l), i.e. the expected 95th percentile in this age group, subsequent molecular genetic analysis for FH is performed. Children with FH pathogenic variants enter a registry and are treated by specialized pediatricians. Furthermore, qualified training centers offer FH-focused training courses to affected families. For first-degree relatives, reverse cascade screening is recommended to identify and treat affected family members. RESULTS: Implementation of VRONI required intensive preparations to address ethical, educational, data-safety, legal and organizational aspects, which are outlined in this article. Recruitment started in early 2021; within the first months, more than 380 pediatricians screened over 5,200 children. Approximately 50,000 children are expected to be enrolled in the VRONI study by 2024. CONCLUSIONS: VRONI aims to test the feasibility of population-based screening for FH in children in Bavaria, intending to set the stage for a nationwide FH screening infrastructure. Furthermore, we aim to validate genetic variants of unclear significance, detect novel causative mutations and contribute to polygenic risk indices (DRKS00022140; August 2020).


Subject(s)
Hyperlipoproteinemia Type II , Aged, 80 and over , Child , Early Diagnosis , Humans , Hyperlipoproteinemia Type II/diagnosis , Hyperlipoproteinemia Type II/epidemiology , Hyperlipoproteinemia Type II/genetics , Mass Screening
4.
Dis Esophagus ; 32(8)2019 Aug 01.
Article in English | MEDLINE | ID: mdl-31329831

ABSTRACT

Risk stratification in patients with Barrett's esophagus (BE) to prevent the development of esophageal adenocarcinoma (EAC) is an unsolved task. The incidence of EAC and BE is increasing, and patients are still at unknown risk. BarrettNET is an ongoing multicenter prospective cohort study initiated to identify and validate molecular and clinical biomarkers that allow a more personalized surveillance strategy for patients with BE. For BarrettNET, participants are recruited at 20 study centers throughout Germany, to be followed for progression to dysplasia (low-grade or high-grade dysplasia) or EAC for >10 years. The study instruments comprise self-administered epidemiological questionnaires (covering demographics, lifestyle factors, and health) as well as biological specimens, i.e., blood-based samples, esophageal tissue biopsies, and feces and saliva samples. Sample collection is repeated at follow-up visits according to each participant's individual surveillance plan. Standardized collection and processing of the specimens ensure high sample quality. Inclusion, epidemiological data, and pathological disease status are documented in a mobile-accessible database. Currently, the BarrettNET registry includes 560 participants (23.1% women and 76.9% men, aged 22-92 years) with a median follow-up of 951 days. Both the design and the size of BarrettNET offer the advantage of answering research questions regarding potential causes of disease progression from BE to EAC. Here, all the integrated methods and materials of BarrettNET are presented and reviewed to introduce this valuable German registry.


Subject(s)
Adenocarcinoma/diagnosis , Barrett Esophagus/complications , Early Detection of Cancer/methods , Esophageal Neoplasms/diagnosis , Population Surveillance/methods , Risk Assessment/methods , Adenocarcinoma/etiology , Adult , Aged , Aged, 80 and over , Biomarkers/analysis , Clinical Decision Rules , Disease Progression , Esophageal Neoplasms/etiology , Female , Germany , Humans , Male , Middle Aged , Prospective Studies , Registries , Risk Factors , Young Adult
5.
BMC Med Inform Decis Mak ; 19(1): 178, 2019 09 04.
Article in English | MEDLINE | ID: mdl-31484555

ABSTRACT

BACKGROUND: The collection of data and biospecimens which characterize patients and probands in depth is a core element of modern biomedical research. Such data must be considered highly sensitive and needs to be protected from unauthorized use and re-identification. In this context, laws, regulations, guidelines and best practices often recommend or mandate pseudonymization, which means that directly identifying data of subjects (e.g. names and addresses) is stored separately from the data primarily needed for scientific analyses. DISCUSSION: When (authorized) re-identification of subjects is not an exceptional but a common procedure, e.g. due to longitudinal data collection, implementing pseudonymization can significantly increase the complexity of software solutions. For example, data stored in distributed databases needs to be dynamically combined, which requires additional interfaces for communication between the various subsystems. This increased complexity may introduce new attack vectors for intruders. Obviously, this is in contrast to the objective of improving data protection. What is lacking is a standardized process of evaluating and reporting risks, threats and countermeasures, which can be used to test whether integrating pseudonymization methods into data collection systems actually improves upon the degree of protection provided by system designs that simply follow common IT security best practices and implement fine-grained role-based access control models. To demonstrate that the methods used to describe systems employing pseudonymized data management are currently heterogeneous and ad hoc, we examined the extent to which twelve recent studies address each of the six basic security properties defined by the International Organization for Standardization (ISO) 27000 standard. We show inconsistencies across the studies, with most of them failing to mention one or more security properties.
CONCLUSION: We discuss the degree of privacy protection provided by implementing pseudonymization into research data collection processes. We conclude that (1) more research is needed on the interplay of pseudonymity, information security and data protection, (2) problem-specific guidelines for evaluating and reporting risks, threats and countermeasures should be developed and that (3) future work on pseudonymized research data collection should include the results of such structured and integrated analyses.
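The pseudonymization principle discussed in this abstract, holding directly identifying data and research data in separate stores linked only by a pseudonym, can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: the two dictionaries stand in for physically separated databases, and all names and records are invented.

```python
# Sketch of pseudonymized data management: identifying data and research
# data are kept in separate stores, linked only by a random pseudonym.
import secrets

identity_store = {}   # pseudonym -> identifying data (restricted access)
research_store = {}   # pseudonym -> biomedical data (used for analyses)

def enroll(name, address, medical_record):
    """Register a subject: generate a random, non-derivable pseudonym and
    split the data across the two stores."""
    pid = secrets.token_hex(8)
    identity_store[pid] = {"name": name, "address": address}
    research_store[pid] = medical_record
    return pid

def reidentify(pid):
    """Authorized re-identification, e.g. for longitudinal follow-up."""
    return identity_store[pid]

pid = enroll("Jane Doe", "Example St 1", {"diagnosis": "I63.9"})
assert "name" not in research_store[pid]      # analyses never see identities
assert reidentify(pid)["name"] == "Jane Doe"  # linkage remains possible
```

In a real deployment the two stores would run on independent backend servers with separate access control, which is exactly where the additional interfaces and attack surface discussed above arise.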


Subject(s)
Anonyms and Pseudonyms , Biomedical Research , Confidentiality , Computer Communication Networks , Computer Security/standards , Humans
6.
BMC Med Inform Decis Mak ; 17(1): 30, 2017 03 23.
Article in English | MEDLINE | ID: mdl-28330491

ABSTRACT

BACKGROUND: Translational researchers need robust IT solutions to access a range of data types, varying from public data sets to pseudonymised patient information with restricted access, provided on a case-by-case basis. This complexity arises because managing access policies to sensitive human data must consider issues of data confidentiality, identifiability, extent of consent, and data usage agreements. All these ethical, social and legal aspects must be incorporated into a differential management of restricted access to sensitive data. METHODS: In this paper we present a pilot system that uses several common open source software components in a novel combination to coordinate access to heterogeneous biomedical data repositories containing open data (open access) as well as sensitive data (restricted access) in the domain of biobanking and biosample research. Our approach is based on a digital identity federation and software to manage resource access entitlements. RESULTS: Open source software components were assembled and configured in such a way that they allow for different ways of restricted access according to the protection needs of the data. We have tested the resulting pilot infrastructure and assessed its performance, feasibility and reproducibility. CONCLUSIONS: Common open source software components are sufficient to allow for the creation of a secure system for differential access to sensitive data. The implementation of this system is exemplary for researchers facing similar requirements for restricted access data. Here we report experience and lessons learnt from our pilot implementation, which may be useful for similar use cases. Furthermore, we discuss possible extensions for more complex scenarios.


Subject(s)
Biological Specimen Banks/standards , Biomedical Research/standards , Computer Security/standards , Datasets as Topic , Translational Research, Biomedical/standards , Humans , Pilot Projects
7.
BMC Med Inform Decis Mak ; 16: 49, 2016 Apr 30.
Article in English | MEDLINE | ID: mdl-27130179

ABSTRACT

BACKGROUND: Privacy must be protected when sensitive biomedical data is shared, e.g. for research purposes. Data de-identification is an important safeguard, where datasets are transformed to meet two conflicting objectives: minimizing re-identification risks while maximizing data quality. Typically, de-identification methods search a solution space of possible data transformations to find a good solution to a given de-identification problem. In this process, parts of the search space must be excluded to maintain scalability. OBJECTIVES: The set of transformations which are solution candidates is typically narrowed down by storing the results obtained during the search process and then using them to predict properties of the output of other transformations in terms of privacy (first objective) and data quality (second objective). However, due to the exponential growth of the size of the search space, previous implementations of this method are not well-suited when datasets contain many attributes which need to be protected. As this is often the case with biomedical research data, e.g. as a result of longitudinal collection, we have developed a novel method. METHODS: Our approach combines the mathematical concept of antichains with a data structure inspired by prefix trees to represent properties of a large number of data transformations while requiring only a minimal amount of information to be stored. To analyze the improvements which can be achieved by adopting our method, we have integrated it into an existing algorithm and we have also implemented a simple best-first branch and bound search (BFS) algorithm as a first step towards methods which fully exploit our approach. We have evaluated these implementations with several real-world datasets and the k-anonymity privacy model. 
RESULTS: When integrated into existing de-identification algorithms for low-dimensional data, our approach reduced memory requirements by up to one order of magnitude and execution times by up to 25%. This allowed us to increase the size of solution spaces which could be processed by almost a factor of 10. When using the simple BFS method, we were able to further increase the size of the solution space by a factor of three. When used as a heuristic strategy for high-dimensional data, the BFS approach outperformed a state-of-the-art algorithm by up to 12% in terms of the quality of output data. CONCLUSIONS: This work shows that implementing methods of data de-identification for real-world applications is a challenging task. Our approach solves a problem often faced by data custodians: a lack of scalability of de-identification software when used with datasets having realistic schemas and volumes. The method described in this article has been implemented in ARX, an open source de-identification software for biomedical data.
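The k-anonymity privacy model used in this evaluation requires that every combination of quasi-identifying attribute values occurs in at least k records. A minimal check can be written as follows; the records are hypothetical and already generalized (10-year age bins, 3-digit ZIP prefixes), not data from the study.

```python
# Sketch: checking k-anonymity for a set of generalized records.
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True iff every quasi-identifier combination occurs at least k times."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    return min(Counter(keys).values()) >= k

# Hypothetical generalized records.
records = [
    {"age": "30-39", "zip": "801", "diagnosis": "A"},
    {"age": "30-39", "zip": "801", "diagnosis": "B"},
    {"age": "40-49", "zip": "802", "diagnosis": "A"},
    {"age": "40-49", "zip": "802", "diagnosis": "C"},
]
print(is_k_anonymous(records, ["age", "zip"], k=2))  # → True: each group has 2 records
```

The search-space problem the paper addresses comes from the number of possible generalization combinations growing exponentially with the number of quasi-identifiers, which is why properties of candidate transformations are predicted rather than checked one by one as above.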


Subject(s)
Algorithms , Confidentiality , Medical Informatics/methods , Models, Statistical , Humans
8.
J Biomed Inform ; 58: 37-48, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26385376

ABSTRACT

OBJECTIVE: With the ARX data anonymization tool, structured biomedical data can be de-identified using syntactic privacy models, such as k-anonymity. Data is transformed with two methods: (a) generalization of attribute values, followed by (b) suppression of data records. The former method results in data that is well suited for analyses by epidemiologists, while the latter method significantly reduces loss of information. Our tool uses an optimal anonymization algorithm that maximizes output utility according to a given measure. To achieve scalability, existing optimal anonymization algorithms exclude parts of the search space by predicting the outcome of data transformations regarding privacy and utility without explicitly applying them to the input dataset. These optimizations cannot be used if data is transformed with generalization and suppression. As optimal data utility and scalability are important for anonymizing biomedical data, we had to develop a novel method. METHODS: In this article, we first confirm experimentally that combining generalization with suppression significantly increases data utility. Next, we prove that, within this coding model, the outcome of data transformations regarding privacy and utility cannot be predicted. As a consequence, existing algorithms fail to deliver optimal data utility. We confirm this finding experimentally. The limitation of previous work can be overcome at the cost of increased computational complexity. However, scalability is important for anonymizing data with user feedback. Consequently, we identify properties of datasets that may be predicted in our context and propose a novel and efficient algorithm. Finally, we evaluate our solution with multiple datasets and privacy models. RESULTS: This work presents the first thorough investigation of which properties of datasets can be predicted when data is anonymized with generalization and suppression.
Our novel approach adapts existing optimization strategies to our context and combines different search methods. The experiments show that our method is able to efficiently solve a broad spectrum of anonymization problems. CONCLUSION: Our work shows that implementing syntactic privacy models is challenging and that existing algorithms are not well suited for anonymizing data with transformation models which are more complex than generalization alone. As such models have been recommended for use in the biomedical domain, our results are of general relevance for de-identifying structured biomedical data.
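The coding model described here, generalization of attribute values followed by suppression of records that still violate the privacy constraint, can be sketched in a few lines. The age hierarchy below is a hypothetical example; ARX's actual transformation model supports arbitrary user-defined hierarchies and utility measures.

```python
# Sketch: generalization followed by record suppression for k-anonymity.
from collections import Counter

def generalize_age(age, level):
    """Toy generalization hierarchy: exact age -> 10-year bin -> '*'."""
    if level == 0:
        return str(age)
    if level == 1:
        lo = (age // 10) * 10
        return f"{lo}-{lo + 9}"
    return "*"

def anonymize(ages, level, k):
    """Generalize all values to the given level, then suppress (None)
    records whose generalized value occurs fewer than k times."""
    gen = [generalize_age(a, level) for a in ages]
    counts = Counter(gen)
    return [g if counts[g] >= k else None for g in gen]

ages = [34, 36, 45, 47, 71]
print(anonymize(ages, level=1, k=2))
# → ['30-39', '30-39', '40-49', '40-49', None]
```

Without suppression, satisfying k = 2 here would require generalizing every record to '*' (losing all information); suppressing the single outlier record preserves the useful 10-year bins for the rest, which is the utility gain the abstract describes.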


Subject(s)
Information Services/economics , Information Services/standards , Computer Security , Models, Theoretical , Privacy
9.
BMC Med Inform Decis Mak ; 15: 100, 2015 Nov 30.
Article in English | MEDLINE | ID: mdl-26621059

ABSTRACT

BACKGROUND: Collaborative collection and sharing of data have become a core element of biomedical research. Typical applications are multi-site registries which collect sensitive person-related data prospectively, often together with biospecimens. To secure these sensitive data, national and international data protection laws and regulations demand the separation of identifying data from biomedical data and the introduction of pseudonyms. Neither the formulation in laws and regulations nor existing pseudonymization concepts, however, are precise enough to directly provide an implementation guideline. We therefore describe core requirements as well as implementation options for registries and study databases with sensitive biomedical data. METHODS: We first analyze existing concepts and compile a set of fundamental requirements for pseudonymized data management. Then we derive a system architecture that fulfills these requirements. Next, we provide a comprehensive overview and a comparison of different technical options for an implementation. Finally, we develop a generic software solution for managing pseudonymized data and show its feasibility by describing how we have used it to realize two research networks. RESULTS: We have found that pseudonymization models are highly heterogeneous, already on a conceptual level. We have compiled a set of requirements from different pseudonymization schemes. We propose an architecture and present an overview of technical options. Based on a selection of technical elements, we suggest a generic solution. It supports the multi-site collection and management of biomedical data. Security measures are multi-tier pseudonymity and physical separation of data over independent backend servers. Integrated views are provided by a web-based user interface. Our approach has been successfully used to implement a national and an international rare disease network.
CONCLUSIONS: We were able to identify a set of core requirements out of several pseudonymization models. Considering various implementation options, we realized a generic solution which was implemented and deployed in research networks. Still, further conceptual work on pseudonymity is needed. Specifically, it remains unclear how exactly data is to be separated into distributed subsets. Moreover, a thorough risk and threat analysis is needed.


Subject(s)
Biomedical Research/standards , Confidentiality/standards , Datasets as Topic/standards , Guidelines as Topic/standards , Registries/standards , Humans
10.
J Biomed Inform ; 50: 62-76, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24333850

ABSTRACT

Sensitive biomedical data is often collected from distributed sources, involving different information systems and different organizational units. Local autonomy and legal reasons lead to a need for privacy-preserving integration concepts. In this article, we focus on anonymization, which plays an important role for the re-use of clinical data and for the sharing of research data. We present a flexible solution for anonymizing distributed data in the semi-honest model. Prior to the anonymization procedure, an encrypted global view of the dataset is constructed by means of a secure multi-party computing (SMC) protocol. This global representation can then be anonymized. Our approach is not limited to specific anonymization algorithms but provides pre- and postprocessing for a broad spectrum of algorithms and many privacy criteria. We present an extensive analytical and experimental evaluation and discuss which types of methods and criteria are supported. Our prototype demonstrates the approach by implementing k-anonymity, ℓ-diversity, t-closeness and δ-presence with a globally optimal de-identification method in horizontally and vertically distributed setups. The experiments show that our method provides highly competitive performance and offers a practical and flexible solution for anonymizing distributed biomedical datasets.
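Of the privacy criteria listed above, ℓ-diversity extends k-anonymity by additionally requiring diversity of the sensitive attribute within each quasi-identifier group. A minimal check of the distinct-ℓ-diversity variant, with made-up records (this is an illustration of the criterion itself, not of the paper's SMC protocol):

```python
# Sketch: checking distinct l-diversity for generalized records.
from collections import defaultdict

def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """True iff every quasi-identifier group contains at least l
    distinct values of the sensitive attribute."""
    groups = defaultdict(set)
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        groups[key].add(r[sensitive])
    return all(len(values) >= l for values in groups.values())

records = [
    {"age": "30-39", "diagnosis": "A"},
    {"age": "30-39", "diagnosis": "B"},
    {"age": "40-49", "diagnosis": "A"},
    {"age": "40-49", "diagnosis": "A"},  # this group has only one diagnosis
]
print(is_l_diverse(records, ["age"], "diagnosis", l=2))  # → False
```

The example shows why k-anonymity alone is insufficient: the "40-49" group is 2-anonymous, yet an attacker who knows someone's age group learns their diagnosis with certainty.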


Subject(s)
Medical Records Systems, Computerized , Privacy , Algorithms , Models, Theoretical
11.
Neurol Res Pract ; 6(1): 15, 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38449051

ABSTRACT

INTRODUCTION: In Multiple Sclerosis (MS), patients' characteristics and (bio)markers that reliably predict the individual disease prognosis at disease onset are lacking. Cohort studies allow a close follow-up of MS histories and a thorough phenotyping of patients. Therefore, a multicenter cohort study was initiated to implement a wide spectrum of data and (bio)markers in newly diagnosed patients. METHODS: ProVal-MS (Prospective study to validate a multidimensional decision score that predicts treatment outcome at 24 months in untreated patients with clinically isolated syndrome or early Relapsing-Remitting-MS) is a prospective cohort study in patients with clinically isolated syndrome (CIS) or Relapsing-Remitting (RR)-MS (McDonald 2017 criteria), diagnosed within the last two years, conducted at five academic centers in Southern Germany. The collection of clinical, laboratory, imaging, and paraclinical data as well as biosamples is harmonized across centers. The primary goal is to validate (discrimination and calibration) the previously published DIFUTURE MS-Treatment Decision Score (MS-TDS). The score supports clinical decision-making regarding the options of early (within 6 months after study baseline) platform medication (interferon beta, glatiramer acetate, dimethyl/diroximel fumarate, teriflunomide) or no immediate treatment (> 6 months after baseline) of patients with early RR-MS and CIS by predicting the probability of new or enlarging lesions in cerebral magnetic resonance images (MRIs) between 6 and 24 months. Further objectives are refining the MS-TDS and providing data to identify new markers reflecting disease course and severity. The project also provides a technical evaluation of the ProVal-MS cohort within the IT infrastructure of the DIFUTURE consortium (Data Integration for Future Medicine) and assesses the efficacy of the data sharing techniques developed.
PERSPECTIVE: Clinical cohorts provide the infrastructure to discover and validate relevant disease-specific findings. A successful validation of the MS-TDS will add a new clinical decision tool to the armamentarium of practicing MS neurologists, from which newly diagnosed MS patients may benefit. TRIAL REGISTRATION: ProVal-MS has been registered in the German Clinical Trials Register ('Deutsches Register Klinischer Studien', DRKS), ID: DRKS00014034, date of registration: 21 December 2018; https://drks.de/search/en/trial/DRKS00014034.

12.
Med Genet ; 34(1): 41-51, 2022 Apr.
Article in English | MEDLINE | ID: mdl-38836010

ABSTRACT

Familial hypercholesterolemia (FH) is the most frequent monogenic disorder (prevalence 1:250) in the general population. Early diagnosis during childhood enables pre-emptive treatment, thus reducing the risk of severe atherosclerotic manifestations later in life. Nonetheless, FH screening programs are scarce. VRONI offers all children aged 5-14 years in Bavaria an FH screening in the context of regular pediatric visits. LDL-cholesterol (LDL-C) is measured centrally, followed by genetic analysis for FH if it exceeds the age-specific 95th percentile (130 mg/dl, 3.34 mmol/l). Children with FH pathogenic variants are treated by specialized pediatricians and offered an FH-focused training course by a qualified training center. Reverse cascade screening is recommended for all first-degree relatives. VRONI aims to prove the feasibility of a population-based FH screening in children and to lay the foundation for a nationwide screening program.

13.
Sci Data ; 7(1): 435, 2020 12 10.
Article in English | MEDLINE | ID: mdl-33303746

ABSTRACT

The Lean European Open Survey on SARS-CoV-2 Infected Patients (LEOSS) is a European registry for studying the epidemiology and clinical course of COVID-19. To support evidence generation at the rapid pace required in a pandemic, LEOSS follows an Open Science approach, making data available to the public in real time. To protect patient privacy, quantitative anonymization procedures are used to protect the continuously published data stream, consisting of 16 variables on the course and therapy of COVID-19, from singling-out, inference, and linkage attacks. We investigated the bias introduced by this process and found that it has very little impact on the quality of output data. Current laws do not specify requirements for the application of formal anonymization methods, there is a lack of guidelines with clear recommendations, and few real-world applications of quantitative anonymization procedures have been described in the literature. We therefore believe that our work can help others with developing urgently needed anonymization pipelines for their projects.


Subject(s)
COVID-19/epidemiology , Data Anonymization , Pandemics , Registries , Adult , Aged , Aged, 80 and over , Biomedical Research , Confidentiality , Datasets as Topic , Female , Humans , Male , Middle Aged
14.
Cancer Prev Res (Phila) ; 13(4): 377-384, 2020 04.
Article in English | MEDLINE | ID: mdl-32066580

ABSTRACT

Endoscopic screening for Barrett's esophagus as the major precursor lesion for esophageal adenocarcinoma is mostly offered to patients with symptoms of gastroesophageal reflux disease (GERD). However, other epidemiologic risk factors might affect the development of Barrett's esophagus and esophageal adenocarcinoma. Therefore, efforts to improve the efficiency of screening to find the Barrett's esophagus population "at risk" compared with the normal population are needed. In a cross-sectional analysis, we compared 587 patients with Barrett's esophagus from the multicenter German BarrettNET registry to 1,976 healthy subjects from the population-based German KORA cohort, with and without GERD symptoms. Data on demographic and lifestyle factors, including age, gender, smoking, alcohol consumption, body mass index, physical activity, and symptoms, were collected in a standardized epidemiologic survey. Increased age, male gender, smoking, heavy alcohol consumption, low physical activity, low health status, and GERD symptoms were significantly associated with Barrett's esophagus. Surprisingly, among patients stratified for GERD symptoms, these associations did not change. Demographic, lifestyle, and clinical factors as well as GERD symptoms were associated with Barrett's esophagus development in Germany, suggesting that a combination of risk factors could be useful in developing individualized screening efforts for patients with Barrett's esophagus and GERD in Germany.


Subject(s)
Adenocarcinoma/epidemiology , Alcohol Drinking/adverse effects , Barrett Esophagus/epidemiology , Esophageal Neoplasms/epidemiology , Gastroesophageal Reflux/epidemiology , Registries/statistics & numerical data , Smoking/adverse effects , Adenocarcinoma/etiology , Adenocarcinoma/pathology , Adult , Aged , Aged, 80 and over , Barrett Esophagus/etiology , Barrett Esophagus/pathology , Body Mass Index , Case-Control Studies , Cross-Sectional Studies , Esophageal Neoplasms/etiology , Esophageal Neoplasms/pathology , Female , Follow-Up Studies , Gastroesophageal Reflux/etiology , Gastroesophageal Reflux/pathology , Germany/epidemiology , Humans , Male , Middle Aged , Prognosis , Prospective Studies , Retrospective Studies , Risk Factors , Young Adult
15.
IEEE J Biomed Health Inform ; 22(2): 611-622, 2018 03.
Article in English | MEDLINE | ID: mdl-28358693

ABSTRACT

The sharing of sensitive personal health data is an important aspect of biomedical research. Methods of data de-identification are often used in this process to trade off the granularity of data against privacy risks. However, traditional approaches, such as HIPAA Safe Harbor or k-anonymization, often fail to provide data with sufficient quality. Alternatively, data can be de-identified only to a degree which still allows us to use it as required, e.g., to carry out specific analyses. Controlled environments, which restrict the ways recipients can interact with the data, can then be used to cope with residual risks. The contributions of this article are twofold. First, we present a method for implementing controlled data sharing environments and analyze its privacy properties. Second, we present a de-identification method which is specifically suited for sanitizing health data which is to be shared in such environments. Traditional de-identification methods control the uniqueness of records in a dataset. The basic idea of our approach is to reduce the probability that a record in a dataset has characteristics which are unique within the underlying population. As the characteristics of the population are typically not known, we have implemented a pragmatic solution in which properties of the population are modeled with statistical methods. We have further developed an accompanying process for evaluating and validating the degree of protection provided. The results of an extensive experimental evaluation show that our approach enables the safe sharing of high-quality data and that it is highly scalable.
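The uniqueness notion at the core of this abstract can be illustrated with a crude sample-level measure: the fraction of records whose quasi-identifier combination occurs only once. Note this is a simplified proxy; the paper's approach estimates uniqueness in the underlying population with statistical models, which a sketch cannot reproduce. The records below are made up.

```python
# Sketch: fraction of sample-unique quasi-identifier combinations.
from collections import Counter

def sample_uniqueness(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination is unique
    in the sample. A crude proxy only: population uniqueness, which the
    described method targets, must be estimated statistically."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)

records = [
    {"age": "30-39", "sex": "F"},
    {"age": "30-39", "sex": "F"},
    {"age": "40-49", "sex": "M"},   # unique in this sample
    {"age": "50-59", "sex": "F"},   # unique in this sample
]
print(sample_uniqueness(records, ["age", "sex"]))  # → 0.5
```

A record that is unique in the sample may still be common in the population, which is why sample uniqueness alone overestimates risk and motivates the statistical population models mentioned above.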


Subject(s)
Confidentiality , Databases, Factual , Information Dissemination/methods , Medical Records , Algorithms , Biomedical Research , Humans
16.
Biopreserv Biobank ; 16(2): 97-105, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29359962

ABSTRACT

The known challenge of underutilization of data and biological material from biorepositories as potential resources for medical research has been the focus of discussion for over a decade. Recently developed guidelines for improved data availability and reusability-entitled FAIR Principles (Findability, Accessibility, Interoperability, and Reusability)-are likely to address only parts of the problem. In this article, we argue that biological material and data should be viewed as a unified resource. This approach would facilitate access to complete provenance information, which is a prerequisite for reproducibility and meaningful integration of the data. A unified view also allows for optimization of long-term storage strategies, as demonstrated in the case of biobanks. We propose an extension of the FAIR Principles to include the following additional components: (1) quality aspects related to research reproducibility and meaningful reuse of the data, (2) incentives to stimulate effective enrichment of data sets and biological material collections and their reuse at all levels, and (3) privacy-respecting approaches for working with the human material and data. These FAIR-Health principles should then be applied to both the biological material and data. We also propose the development of common guidelines for cloud architectures, due to the unprecedented growth of volume and breadth of medical data generation, as well as the associated need to process the data efficiently.


Subject(s)
Biological Specimen Banks , Confidentiality/standards , Databases, Factual/standards , Information Dissemination/methods , Biological Specimen Banks/organization & administration , Biological Specimen Banks/standards , Guidelines as Topic , Humans
17.
Methods Inf Med ; 55(4): 347-55, 2016 Aug 05.
Article in English | MEDLINE | ID: mdl-27322502

ABSTRACT

BACKGROUND: Data sharing is a central aspect of modern biomedical research. It is accompanied by significant privacy concerns, and data often needs to be protected from re-identification. With methods of de-identification, datasets can be transformed in such a way that it becomes extremely difficult to link their records to identified individuals. The most important challenge in this process is to find an adequate balance between an increase in privacy and a decrease in data quality. OBJECTIVES: Accurately measuring the risk of re-identification in a specific data sharing scenario is an important aspect of data de-identification. Overestimation of risks will significantly deteriorate data quality, while underestimation will leave data prone to attacks on privacy. Several models have been proposed for measuring risks, but there is a lack of generic methods for risk-based data de-identification. The aim of the work described in this article was to bridge this gap and to show how the quality of de-identified datasets can be improved by using risk models to tailor the process of de-identification to a concrete context. METHODS: We implemented a generic de-identification process and several models for measuring re-identification risks into the ARX de-identification tool for biomedical data. By integrating the methods into an existing framework, we were able to automatically transform datasets in such a way that information loss is minimized while it is ensured that re-identification risks meet a user-defined threshold. We performed an extensive experimental evaluation to analyze the impact of using different risk models and assumptions about the goals and the background knowledge of an attacker on the quality of de-identified data. RESULTS: The results of our experiments show that data quality can be improved significantly by using risk models for data de-identification. 
On a scale where 100 % represents the original input dataset and 0 % represents a dataset from which all information has been removed, the loss of information content could be reduced by up to 10 % when protecting datasets against strong adversaries and by up to 24 % when protecting datasets against weaker adversaries. CONCLUSIONS: The methods studied in this article are well suited for protecting sensitive biomedical data and our implementation is available as open-source software. Our results can be used by data custodians to increase the information content of de-identified data by tailoring the process to a specific data sharing scenario. Improving data quality is important for fostering the adoption of de-identification methods in biomedical research.
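The abstract above describes transforming data so that re-identification risks stay below a user-defined threshold. One common risk measure in this literature is the prosecutor risk, 1 / (size of a record's equivalence class over the quasi-identifiers). As a hedged illustration only — not the paper's actual algorithm, which minimizes information loss through generalization rather than plain suppression — the sketch below enforces a risk threshold by suppressing records whose class is too small. All field names and the threshold value are hypothetical.

```python
from collections import Counter

def enforce_risk_threshold(records, quasi_identifiers, max_risk):
    """Suppress every record whose prosecutor re-identification risk
    (1 / equivalence-class size over the quasi-identifiers) exceeds
    max_risk; return (kept_records, number_suppressed)."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    sizes = Counter(keys)
    kept = [r for r, k in zip(records, keys) if 1 / sizes[k] <= max_risk]
    return kept, len(records) - len(kept)

# Hypothetical example: three records share one quasi-identifier
# combination (risk 1/3); the fourth is a singleton (risk 1.0).
people = [
    {"zip": "80331", "age": 50, "sex": "M"},
    {"zip": "80331", "age": 50, "sex": "M"},
    {"zip": "80331", "age": 50, "sex": "M"},
    {"zip": "81675", "age": 27, "sex": "F"},
]
kept, n_suppressed = enforce_risk_threshold(people, ["zip", "age", "sex"], 0.5)
print(len(kept), n_suppressed)  # 3 records kept, 1 suppressed
```

Suppression is the bluntest instrument available; the abstract's point is precisely that tailoring the risk model to the scenario lets a tool like ARX retain far more information than a worst-case approach would.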


Subject(s)
Biomedical Research , Databases, Factual , Patient Identification Systems , Computer Security , Data Accuracy , Humans , Models, Theoretical , Risk
18.
AMIA Annu Symp Proc ; 2014: 984-93, 2014.
Article in English | MEDLINE | ID: mdl-25954407

ABSTRACT

Collaboration and data sharing have become core elements of biomedical research. Especially when sensitive data from distributed sources are linked, privacy threats have to be considered. Statistical disclosure control allows the protection of sensitive data by introducing fuzziness. Reduction of data quality, however, needs to be balanced against gains in protection. Therefore, tools are needed which provide a good overview of the anonymization process to those responsible for data sharing. These tools require graphical interfaces and the use of intuitive and replicable methods. In addition, extensive testing, documentation and openness to reviews by the community are important. Existing publicly available software is limited in functionality, and often active support is lacking. We present ARX, an anonymization tool that i) implements a wide variety of privacy methods in a highly efficient manner, ii) provides an intuitive cross-platform graphical interface, iii) offers a programming interface for integration into other software systems, and iv) is well documented and actively supported.
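ARX itself is implemented in Java and exposes both a graphical and a programming interface, as the abstract notes. As a language-agnostic sketch of the kind of privacy model such a tool implements — a minimal k-anonymity check plus one generalization step, written in Python purely for illustration and not using the real ARX API — consider:

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True iff every quasi-identifier combination occurs >= k times."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())

def generalize_age(record, width=10):
    """Coarsen an exact age to a decade interval, e.g. 34 -> '30-39'.
    One rung on a hypothetical generalization hierarchy."""
    lo = (record["age"] // width) * width
    return {**record, "age": f"{lo}-{lo + width - 1}"}

# Hypothetical records: distinct ages make the raw data trivially
# re-identifiable; generalizing to decades restores 2-anonymity.
raw = [
    {"zip": "816", "age": 31, "sex": "F"},
    {"zip": "816", "age": 38, "sex": "F"},
]
generalized = [generalize_age(r) for r in raw]
print(is_k_anonymous(raw, ["zip", "age", "sex"], 2))          # False
print(is_k_anonymous(generalized, ["zip", "age", "sex"], 2))  # True
```

A production tool like ARX searches a lattice of such generalization levels efficiently and supports many privacy models beyond k-anonymity; the sketch above only conveys the basic mechanism.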


Subject(s)
Computer Graphics , Confidentiality , Information Dissemination , Software , User-Computer Interface , Electronic Health Records , Humans
19.
Orphanet J Rare Dis ; 7: 66, 2012 Sep 17.
Article in English | MEDLINE | ID: mdl-22985983

ABSTRACT

We report the development of an international registry for Neurodegeneration with Brain Iron Accumulation (NBIA), in the context of TIRCON (Treat Iron-Related Childhood-Onset Neurodegeneration), an EU-FP7 - funded project. This registry aims to combine scattered resources, integrate clinical and scientific knowledge, and generate a rich source for future research studies. This paper describes the content, architecture and future utility of the registry with the intent to capture as many NBIA patients as possible and to offer comprehensive information to the international scientific community.


Subject(s)
Brain/metabolism , Iron/metabolism , Neurodegenerative Diseases/metabolism , Registries , Humans , Internationality