Results 1 - 7 of 7

1.
NDSS Symp ; 2023, 2023.
Article in English | MEDLINE | ID: mdl-37275390

ABSTRACT

When sharing relational databases with other parties, in addition to providing a high-quality (high-utility) database to the recipients, a database owner also aims to have (i) privacy guarantees for the data entries and (ii) liability guarantees (via fingerprinting) in case of unauthorized redistribution. However, (i) and (ii) are orthogonal objectives, because when sharing a database with multiple recipients, privacy via data sanitization requires adding noise once (and sharing the same noisy version with all recipients), whereas liability via unique fingerprint insertion requires adding different noise to each shared copy to distinguish the recipients. Although achieving (i) and (ii) together is possible in a naïve way (e.g., differentially-private database perturbation or synthesis followed by fingerprinting), this approach results in significant degradation of the utility of the shared databases. In this paper, we achieve privacy and liability guarantees simultaneously by proposing a novel entry-level differentially-private (DP) fingerprinting mechanism for relational databases that avoids large utility degradation. The proposed mechanism fulfills the privacy and liability requirements by leveraging the inherent randomization of fingerprinting and transforming it into provable privacy guarantees. Specifically, we devise a bit-level randomized response scheme to achieve a differential privacy guarantee for arbitrary data entries when sharing the entire database, and, based on this, we develop an ϵ-entry-level DP fingerprinting mechanism. We theoretically analyze the connections between privacy, fingerprint robustness, and database utility by deriving closed-form expressions. We also propose a sparse vector technique-based solution to control the cumulative privacy loss when fingerprinted copies of a database are shared with multiple recipients. We experimentally show that our mechanism achieves strong fingerprint robustness (e.g., the fingerprint cannot be compromised even if the malicious database recipient modifies or distorts more than half of the entries in its received fingerprinted copy) and higher database utility than various baseline methods (e.g., the application-dependent utility of the database shared by the proposed mechanism is higher than that of the considered baselines).
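
As a rough illustration of the bit-level randomized response idea sketched above (a minimal sketch under an assumed integer encoding and illustrative names, not the paper's full ϵ-entry-level DP fingerprinting mechanism), each bit of an entry is kept with probability e^ϵ/(e^ϵ + 1) and flipped otherwise; an entry-level guarantee then follows by composing over the bits:

```python
import math
import random

def bit_level_randomized_response(value, eps, n_bits=8):
    """Randomize an integer-coded entry bit by bit (illustrative sketch).

    Each bit is kept with probability e^eps / (e^eps + 1), giving that bit
    an eps-local-DP guarantee; the entry-level budget then composes over
    the n_bits bits. This is not the paper's full fingerprinting scheme.
    """
    keep_prob = math.exp(eps) / (math.exp(eps) + 1.0)
    out = 0
    for i in range(n_bits):
        bit = (value >> i) & 1
        if random.random() >= keep_prob:
            bit ^= 1  # flip the bit with probability 1 / (e^eps + 1)
        out |= bit << i
    return out
```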

2.
IEEE Trans Dependable Secure Comput ; 20(4): 2939-2953, 2023.
Article in English | MEDLINE | ID: mdl-38384377

ABSTRACT

Database fingerprinting is widely adopted to prevent unauthorized data sharing and to identify the source of data leakages. Although existing schemes are robust against common attacks, their robustness degrades significantly if attackers utilize the inherent correlations among database entries. In this paper, we demonstrate the vulnerability of existing schemes by identifying different correlation attacks: the column-wise correlation attack, the row-wise correlation attack, and their integration. We provide robust fingerprinting against these attacks by developing mitigation techniques, which can work as post-processing steps for any off-the-shelf database fingerprinting scheme and preserve the utility of the databases. We investigate the impact of the correlation attacks and the performance of the mitigation techniques using a real-world database. Our results show (i) high success rates of correlation attacks against existing fingerprinting schemes (e.g., the integrated correlation attack can distort 64.8% of the fingerprint bits by modifying just 14.2% of the entries in a fingerprinted database) and (ii) high robustness of the mitigation techniques (e.g., after mitigation, the integrated correlation attack can only distort 3% of the fingerprint bits). Additionally, the mitigation techniques effectively alleviate correlation attacks even if (i) attackers have access to correlation models computed directly from the original database while the database owner uses inaccurate correlation models, or (ii) attackers utilize higher-order correlations than the database owner.
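
A toy sketch of the column-wise correlation attack described above, under assumed inputs (rows as dictionaries and an attacker-side conditional model cond_probs[a][b] ≈ P(col_b = b | col_a = a)); entries whose values look statistically unlikely are snapped to the most likely value in the hope of erasing fingerprint marks. All names are illustrative, not the paper's implementation:

```python
def column_wise_correlation_attack(rows, col_a, col_b, cond_probs, threshold=0.05):
    """Attacker-side sketch: replace unlikely (col_a, col_b) combinations.

    cond_probs[a][b] is the attacker's estimate of P(col_b = b | col_a = a).
    Values falling below `threshold` are suspected fingerprint marks and
    are snapped to the most likely value under the model.
    """
    attacked = []
    for row in rows:
        row = dict(row)  # do not mutate the caller's data
        dist = cond_probs.get(row[col_a], {})
        if dist and dist.get(row[col_b], 0.0) < threshold:
            row[col_b] = max(dist, key=dist.get)
        attacked.append(row)
    return attacked
```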

3.
Bioinformatics ; 38(Suppl 1): i143-i152, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758787

ABSTRACT

MOTIVATION: Database fingerprinting has been widely used to discourage unauthorized redistribution of data by providing means to identify the source of data leakages. However, no existing fingerprinting scheme aims at achieving liability guarantees when sharing genomic databases. Thus, we are motivated to fill this gap by devising a vanilla fingerprinting scheme specifically for genomic databases. Moreover, since malicious genomic database recipients may compromise the embedded fingerprint (distort the steganographic marks, i.e., the embedded fingerprint bit-string) by launching effective correlation attacks, which leverage the intrinsic correlations among genomic data (e.g., Mendel's law and linkage disequilibrium), we also augment the vanilla scheme by developing mitigation techniques to achieve robust fingerprinting of genomic databases against correlation attacks. RESULTS: Via experiments using a real-world genomic database, we first show that correlation attacks against fingerprinting schemes for genomic databases are very powerful. In particular, the correlation attacks can distort more than half of the fingerprint bits while causing only a small utility loss (e.g., in database accuracy and in the consistency of SNP-phenotype associations measured via P-values). Next, we experimentally show that the correlation attacks can be effectively mitigated by our proposed mitigation techniques. We validate that the attacker can hardly compromise a large portion of the fingerprint bits even if it pays a higher cost in terms of degradation of database utility. For example, with around a 24% loss in accuracy and a 20% loss in the consistency of SNP-phenotype associations, the attacker can only distort about 30% of the fingerprint bits, which is insufficient for it to avoid being accused. We also show that the proposed mitigation techniques preserve the utility of the shared genomic databases, e.g., they lead to only around a 3% loss in accuracy. AVAILABILITY AND IMPLEMENTATION: https://github.com/xiutianxi/robust-genomic-fp-github.
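
As a small illustration of the Mendel's-law correlations such attacks exploit (a sketch assuming genotypes encoded as minor-allele counts 0/1/2; it is not the paper's attack or mitigation code), a correlation-aware attacker could flag trio genotypes that fail this check as likely fingerprint marks:

```python
def mendelian_consistent(child, mother, father):
    """Check whether a child's genotype can arise from the parents' genotypes
    under Mendel's law. Genotypes are minor-allele counts (0, 1, or 2)."""
    def gametes(genotype):
        # alleles a parent can pass on: 0 = major allele, 1 = minor allele
        return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]
    return child in {m + f for m in gametes(mother) for f in gametes(father)}

# Example: a heterozygous child (1) is consistent with parents 0 and 2,
# but a genotype of 2 is not -- such an entry would stand out to an attacker.
assert mendelian_consistent(1, 0, 2) and not mendelian_consistent(2, 0, 2)
```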


Subjects
Algorithms, Genomics, Databases, Factual
4.
CODASPY ; 2022: 77-88, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35531063

ABSTRACT

Privacy-preserving genomic data sharing is essential for increasing the pace of genomic research and, hence, for paving the way towards personalized genomic medicine. In this paper, we introduce (ϵ, T)-dependent local differential privacy (LDP) for privacy-preserving sharing of correlated data and propose a genomic data sharing mechanism under this privacy definition. We first show that the original definition of LDP is not suitable for genomic data sharing, and then we propose a new mechanism to share genomic data. The proposed mechanism considers the correlations in the data during sharing, eliminates statistically unlikely data values beforehand, and adjusts the probability distribution for each shared data point accordingly. By doing so, we show that we can prevent an attacker from inferring the correct values of the shared data points by exploiting the correlations in the data. By adjusting the probability distributions of the shared states of each data point, we also improve the utility of the shared data for the data collector. Furthermore, we develop a greedy algorithm that strategically identifies the processing order of the shared data points with the aim of maximizing the utility of the shared data. Our evaluation results on a real-life genomic dataset show the superiority of the proposed mechanism compared to the randomized response mechanism (a widely used technique to achieve LDP).
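
A minimal sketch of the correlation-aware randomized response idea described above, under assumed inputs (the candidate states of a data point and a conditional model of their probabilities given previously shared, correlated points); statistically unlikely states are removed before the generalized randomized response probabilities are formed. This is only an illustration, not the exact (ϵ, T)-dependent LDP mechanism:

```python
import math
import random

def correlation_aware_rr(true_value, candidate_states, cond_probs, eps, threshold):
    """Generalized randomized response restricted to plausible states.

    cond_probs[s] is an assumed estimate of P(state = s | previously shared,
    correlated data points). States below `threshold` are eliminated and the
    response distribution is renormalized over the remaining states.
    """
    plausible = [s for s in candidate_states
                 if cond_probs.get(s, 0.0) >= threshold or s == true_value]
    k = len(plausible)
    p_keep = math.exp(eps) / (math.exp(eps) + k - 1)
    if k == 1 or random.random() < p_keep:
        return true_value
    return random.choice([s for s in plausible if s != true_value])
```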

5.
IEEE Internet Things J ; 8(21): 15953-15964, 2021 Nov 01.
Article in English | MEDLINE | ID: mdl-35782188

ABSTRACT

The coronavirus disease 2019 (COVID-19) has rapidly become a significant public health emergency all over the world since it was first identified in Wuhan, China, in December 2019. To date, massive amounts of disease-related data have been collected, both manually and through the Internet of Medical Things (IoMT), which can potentially be used to analyze the spread of the disease. On the other hand, with the help of IoMT, analysis results on the current status of COVID-19 can be delivered to people in real time to enable situational awareness, which may help mitigate the spread of the disease in communities. However, currently accessible COVID-19 data are mostly at a macro level, such as for each state, county, or metropolitan area. For fine-grained areas, such as individual cities, communities, or geographic coordinates, COVID-19 data are usually not available, which prevents us from obtaining information on the disease spread in the closer neighborhoods around us. To address this problem, in this article, we propose a two-level risk assessment system. In particular, we define a "risk index." We then develop a risk assessment model, called MK-DNN, that takes advantage of multi-kernel density estimation (MKDE) and a deep neural network (DNN). We train MK-DNN at the macro level (for each metro area), which subsequently enables us to obtain risk indices at the micro level (for each geographic coordinate). Moreover, a heuristic validation method is designed to help validate the obtained micro-level risk indices. Simulations conducted on real-world data demonstrate the accuracy and validity of the proposed risk assessment system.
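
A hypothetical sketch of the multi-kernel density estimation step that such a pipeline could use to turn reported case coordinates into a density feature at a query coordinate (the bandwidths, weights, and function names are illustrative assumptions; the actual MK-DNN model then feeds such features into a deep neural network):

```python
import numpy as np

def multi_kernel_density(query_xy, case_xy, bandwidths=(0.5, 1.0, 2.0), weights=None):
    """Weighted average of 2-D Gaussian kernel density estimates with
    several bandwidths. case_xy is an (n, 2) array of case coordinates;
    the result is a density value at query_xy usable as a DNN feature."""
    if weights is None:
        weights = [1.0 / len(bandwidths)] * len(bandwidths)
    d2 = np.sum((np.asarray(case_xy) - np.asarray(query_xy)) ** 2, axis=1)
    density = 0.0
    for w, h in zip(weights, bandwidths):
        density += w * np.mean(np.exp(-d2 / (2 * h ** 2)) / (2 * np.pi * h ** 2))
    return float(density)
```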

6.
Article in English | MEDLINE | ID: mdl-37964942

ABSTRACT

Database fingerprinting has been widely adopted to prevent unauthorized sharing of data and to identify the source of data leakages. Although existing schemes are robust against common attacks, such as random bit flipping and subset attacks, their robustness degrades significantly if attackers utilize the inherent correlations among database entries. In this paper, we first demonstrate the vulnerability of existing database fingerprinting schemes by identifying different correlation attacks: the column-wise correlation attack, the row-wise correlation attack, and their integration. To provide robust fingerprinting against the identified correlation attacks, we then develop mitigation techniques, which can work as post-processing steps for any off-the-shelf database fingerprinting scheme. The proposed mitigation techniques also preserve the utility of the fingerprinted database under different utility metrics. We empirically investigate the impact of the identified correlation attacks and the performance of the mitigation techniques using real-world relational databases. Our results show (i) high success rates of the identified correlation attacks against existing fingerprinting schemes (e.g., the integrated correlation attack can distort 64.8% of the fingerprint bits by modifying just 14.2% of the entries in a fingerprinted database) and (ii) high robustness of the proposed mitigation techniques (e.g., with the mitigation techniques, the integrated correlation attack can only distort 3% of the fingerprint bits). Furthermore, we show that the proposed mitigation techniques effectively alleviate correlation attacks even if the attacker has access to correlation models calculated directly from the database.
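
A hypothetical sketch of an owner-side, correlation-aware post-processing step in the spirit of the mitigation described above: fingerprint-induced changes that make a row statistically unlikely under the owner's conditional model cond_probs[a][b] ≈ P(col_b = b | col_a = a) are reverted rather than left as conspicuous marks. The simple revert rule and all names are assumptions, not the paper's algorithm:

```python
def mitigate_column_correlations(fp_rows, orig_rows, col_a, col_b,
                                 cond_probs, threshold=0.05):
    """Owner-side sketch: drop fingerprint marks that a correlation-aware
    attacker could easily spot, keeping only marks consistent with the model."""
    mitigated = []
    for fp_row, orig_row in zip(fp_rows, orig_rows):
        fp_row = dict(fp_row)
        dist = cond_probs.get(fp_row[col_a], {})
        mark_inserted = fp_row[col_b] != orig_row[col_b]
        if mark_inserted and dist and dist.get(fp_row[col_b], 0.0) < threshold:
            fp_row[col_b] = orig_row[col_b]  # revert the conspicuous mark
        mitigated.append(fp_row)
    return mitigated
```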

7.
Proc ACM Workshop Priv Electron Soc ; 2020: 163-179, 2020 Nov.
Article in English | MEDLINE | ID: mdl-34485998

ABSTRACT

Although genomic data has significant impact and widespread usage in medical research, it puts individuals' privacy in danger, even if they share their genomic data anonymously or only partially. To address this problem, we present a framework inspired by differential privacy for sharing individuals' genomic data while preserving their privacy. We assume an individual with a sensitive portion of her genome (e.g., mutations or single nucleotide polymorphisms (SNPs) that reveal sensitive information about her) that she does not want to share. The goals of the individual are to (i) preserve the privacy of her sensitive data (considering the correlations between the sensitive and non-sensitive parts), (ii) preserve the privacy of interdependent data (data belonging to other individuals that is correlated with her data), and (iii) share as much non-sensitive data as possible to maximize the utility of data sharing. As opposed to traditional differential privacy-based data sharing schemes, the proposed scheme does not intentionally add noise to data; it is based on selective sharing of data points. We observe that the traditional differential privacy concept does not capture data sharing in such a setting, and hence we first introduce a privacy notion, ϵ-indirect privacy, that addresses data sharing in such settings. We show that the proposed framework does not provide sensitive information to the attacker while it achieves high data sharing utility. We also compare the proposed technique with previous ones and show its advantages in terms of both privacy and data sharing utility.
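
A toy sketch of the selective-sharing idea, assuming a precomputed pairwise leakage score corr[(i, s)] between each candidate data point i and each hidden sensitive point s, and a simple additive budget per sensitive point; the actual framework quantifies indirect privacy far more carefully, so this only illustrates skipping points that would reveal too much about the sensitive portion:

```python
def select_points_to_share(candidate_points, sensitive_ids, corr, leakage_budget):
    """Greedily choose non-sensitive points to share while keeping the
    assumed cumulative leakage toward every sensitive point within budget."""
    leaked = {s: 0.0 for s in sensitive_ids}
    shared = []
    for i in candidate_points:
        cost = {s: corr.get((i, s), 0.0) for s in sensitive_ids}
        if all(leaked[s] + cost[s] <= leakage_budget for s in sensitive_ids):
            shared.append(i)
            for s in sensitive_ids:
                leaked[s] += cost[s]
    return shared
```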
