Search | VHL Regional Portal

Response to comment on "Policy impacts of statistical uncertainty and privacy".

Steed, Ryan; Acquisti, Alessandro; Wu, Zhiwei Steven; Liu, Terrance.

Science ; 380(6648): eadh2297, 2023 Jun 02.

Article in English | MEDLINE | ID: mdl-37262138

ABSTRACT

We offer our thanks to the authors for their thoughtful comments. Cui, Gong, Hannig, and Hoffman propose a valuable improvement to our method of estimating lost entitlements due to data error. Because we don't have access to the unknown, "true" number of children in poverty, our paper simulates data error by drawing counterfactual estimates from a normal distribution around the official, published poverty estimates, which we use to calculate lost entitlements relative to the official allocation of funds. But, if we make the more realistic assumption that the published estimates are themselves normally distributed around the "true" number of children in poverty, Cui et al.'s proposed framework allows us to reliably estimate lost entitlements relative to the unknown, ideal allocation of funds-what districts would have received if we knew the "true" number of children in poverty.

Reply to Sanchéz et al.: Multiplicity does not protect privacy.

Dick, Travis; Dwork, Cynthia; Kearns, Michael; Liu, Terrance; Roth, Aaron; Vietri, Giuseppe; Wu, Zhiwei Steven.

Proc Natl Acad Sci U S A ; 120(18): e2304263120, 2023 05 02.

Article in English | MEDLINE | ID: mdl-37094130

Subject(s)

Confidentiality , Privacy

Confidence-ranked reconstruction of census microdata from published statistics.

Dick, Travis; Dwork, Cynthia; Kearns, Michael; Liu, Terrance; Roth, Aaron; Vietri, Giuseppe; Wu, Zhiwei Steven.

Proc Natl Acad Sci U S A ; 120(8): e2218605120, 2023 Feb 21.

Article in English | MEDLINE | ID: mdl-36800385

ABSTRACT

A reconstruction attack on a private dataset D takes as input some publicly accessible information about the dataset and produces a list of candidate elements of D. We introduce a class of data reconstruction attacks based on randomized methods for nonconvex optimization. We empirically demonstrate that our attacks can not only reconstruct full rows of D from aggregate query statistics Q(D)∈âm but can do so in a way that reliably ranks reconstructed rows by their odds of appearing in the private data, providing a signature that could be used for prioritizing reconstructed rows for further actions such as identity theft or hate crime. We also design a sequence of baselines for evaluating reconstruction attacks. Our attacks significantly outperform those that are based only on access to a public distribution or population from which the private dataset D was sampled, demonstrating that they are exploiting information in the aggregate statistics Q(D) and not simply the overall structure of the distribution. In other words, the queries Q(D) are permitting reconstruction of elements of this dataset, not the distribution from which D was drawn. These findings are established both on 2010 US decennial Census data and queries and Census-derived American Community Survey datasets. Taken together, our methods and experiments illustrate the risks in releasing numerically precise aggregate statistics of a large dataset and provide further motivation for the careful application of provably private techniques such as differential privacy.

Policy impacts of statistical uncertainty and privacy.

Steed, Ryan; Liu, Terrance; Wu, Zhiwei Steven; Acquisti, Alessandro.

Science ; 377(6609): 928-931, 2022 08 26.

Article in English | MEDLINE | ID: mdl-36007047

ABSTRACT

Funding formula reform may help address unequal impacts of uncertainty from data error and privacy protections.

Subject(s)

Censuses , Policy , Privacy , Humans , Uncertainty

Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing.

Beaulieu-Jones, Brett K; Wu, Zhiwei Steven; Williams, Chris; Lee, Ran; Bhavnani, Sanjeev P; Byrd, James Brian; Greene, Casey S.

Circ Cardiovasc Qual Outcomes ; 12(7): e005122, 2019 07.

Article in English | MEDLINE | ID: mdl-31284738

ABSTRACT

BACKGROUND: Data sharing accelerates scientific progress but sharing individual-level data while preserving patient privacy presents a barrier. METHODS AND RESULTS: Using pairs of deep neural networks, we generated simulated, synthetic participants that closely resemble participants of the SPRINT trial (Systolic Blood Pressure Trial). We showed that such paired networks can be trained with differential privacy, a formal privacy framework that limits the likelihood that queries of the synthetic participants' data could identify a real a participant in the trial. Machine learning predictors built on the synthetic population generalize to the original data set. This finding suggests that the synthetic data can be shared with others, enabling them to perform hypothesis-generating analyses as though they had the original trial data. CONCLUSIONS: Deep neural networks that generate synthetic participants facilitate secondary analyses and reproducible investigation of clinical data sets by enhancing data sharing while preserving participant privacy.

Subject(s)

Computer Security , Confidentiality , Deep Learning , Information Dissemination/methods , Antihypertensive Agents/therapeutic use , Blood Pressure/drug effects , Computer Simulation , Data Collection , Humans , Hypertension/diagnosis , Hypertension/drug therapy , Hypertension/physiopathology , Randomized Controlled Trials as Topic , Treatment Outcome

Private algorithms for the protected in social network search.

Kearns, Michael; Roth, Aaron; Wu, Zhiwei Steven; Yaroslavtsev, Grigory.

Proc Natl Acad Sci U S A ; 113(4): 913-8, 2016 Jan 26.

Article in English | MEDLINE | ID: mdl-26755606

ABSTRACT

Motivated by tensions between data privacy for individual citizens and societal priorities such as counterterrorism and the containment of infectious disease, we introduce a computational model that distinguishes between parties for whom privacy is explicitly protected, and those for whom it is not (the targeted subpopulation). The goal is the development of algorithms that can effectively identify and take action upon members of the targeted subpopulation in a way that minimally compromises the privacy of the protected, while simultaneously limiting the expense of distinguishing members of the two groups via costly mechanisms such as surveillance, background checks, or medical testing. Within this framework, we provide provably privacy-preserving algorithms for targeted search in social networks. These algorithms are natural variants of common graph search methods, and ensure privacy for the protected by the careful injection of noise in the prioritization of potential targets. We validate the utility of our algorithms with extensive computational experiments on two large-scale social network datasets.

Subject(s)

Algorithms , Confidentiality , Social Networking , Computer Simulation , Humans

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL