Protecting Genomic Data Privacy with Probabilistic Modeling.
Pac Symp Biocomput; 24: 403-414, 2019.
Article in English | MEDLINE | ID: mdl-30963078
ABSTRACT
The proliferation of sequencing technologies in biomedical research has raised many new privacy concerns, including concerns over the publication of aggregate data at a genomic scale (e.g., minor allele frequencies, regression coefficients). Methods such as differential privacy can address these concerns by providing strong privacy guarantees, but at the cost of greatly perturbing the results of the analysis of interest. Here we investigate an alternative approach for achieving privacy-preserving aggregate genomic data sharing without the high cost to accuracy of differentially private methods. Specifically, we demonstrate how other ideas from the statistical disclosure control literature (notably the idea of disclosure risk) can be applied to aggregate data to help ensure privacy. This is achieved by combining minimal amounts of perturbation with Bayesian statistics and Markov chain Monte Carlo techniques. We test our technique on a GWAS dataset to demonstrate its utility in practice.
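To make the accuracy cost of differential privacy concrete, the following is a minimal sketch of the standard Laplace mechanism applied to a vector of allele frequencies. It is not the method of the article, which relies on disclosure risk, Bayesian statistics and MCMC. The function names, the replace-one-individual neighbouring model and the sensitivity bound are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_allele_freqs(freqs, n_individuals, epsilon, rng):
    """Release allele frequencies with epsilon-differential privacy.

    Assumption (illustrative): neighbouring datasets differ in one diploid
    individual, so each frequency (allele count / 2n) changes by at most
    2 / (2n) = 1/n, and the L1 sensitivity over m SNPs is m / n.
    """
    m = len(freqs)
    scale = m / (n_individuals * epsilon)  # Laplace scale = sensitivity / epsilon
    noisy = [f + laplace_noise(scale, rng) for f in freqs]
    # Clipping to [0, 1] is post-processing and does not weaken the guarantee.
    clipped = [min(1.0, max(0.0, f)) for f in noisy]
    return clipped, scale


if __name__ == "__main__":
    rng = random.Random(0)
    true_freqs = [0.1] * 1000  # hypothetical data: 1000 SNPs
    released, scale = private_allele_freqs(true_freqs, 5000, 1.0, rng)
    print("Laplace scale per SNP:", scale)
```

Under these assumptions the noise scale grows linearly with the number of SNPs released, so with 1000 SNPs, 5000 individuals and epsilon = 1 the scale (0.2) already exceeds the frequencies being protected. This is the kind of perturbation cost that motivates the alternative the abstract describes.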
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Genetic Privacy
Study type: Health_economic_evaluation / Prognostic_studies / Risk_factors_studies
Limits: Humans
Language: English
Journal: Pac Symp Biocomput
Journal subject: BIOTECHNOLOGY / MEDICAL INFORMATICS
Year: 2019
Document type: Article