Revision #4 Authors: Jonathan Ullman, Salil Vadhan

Accepted on: 13th July 2014 03:00

Downloads: 1024

Keywords:

Assuming the existence of one-way functions, we show that there is no

polynomial-time, differentially private algorithm $\san$ that takes a database $D\in (\{0,1\}^d)^n$ and outputs a ``synthetic database'' $\hat{D}$ all of whose two-way marginals are approximately equal to those of $D$. (A two-way marginal is the fraction of database rows $x\in (\{0,1\})^d$ with a given pair of values in a given pair of columns.) This answers a question of Barak et al. (PODS `07), who gave an algorithm running in time $\mathrm{poly}(n,2^d)$.

Our proof combines a construction of hard-to-sanitize databases based on digital signatures (by Dwork et al., STOC `09) with encodings based on the PCP Theorem.

We also present both negative and positive results for generating ``relaxed'' synthetic data, where the fraction of rows in $D$ satisfying a predicate $c$ are estimated by applying $c$ to each row of $\hat{D}$ and aggregating the results in some way.

Revision #3 Authors: Jonathan Ullman, Salil Vadhan

Accepted on: 13th July 2014 02:59

Downloads: 990

Keywords:

Assuming the existence of one-way functions, we show that there is no

polynomial-time, differentially private algorithm $\san$ that takes a database $D\in (\{0,1\}^d)^n$ and outputs a ``synthetic database'' $\hat{D}$ all of whose two-way marginals are approximately equal to those of $D$. (A two-way marginal is the fraction of database rows $x\in (\{0,1\})^d$ with a given pair of values in a given pair of columns.) This answers a question of Barak et al. (PODS `07), who gave an algorithm running in time $\mathrm{poly}(n,2^d)$.

Our proof combines a construction of hard-to-sanitize databases based on digital signatures (by Dwork et al., STOC `09) with encodings based on the PCP Theorem.

We also present both negative and positive results for generating ``relaxed'' synthetic data, where the fraction of rows in $D$ satisfying a predicate $c$ are estimated by applying $c$ to each row of $\hat{D}$ and aggregating the results in some way.

Revision #2 Authors: Jonathan Ullman, Salil Vadhan

Accepted on: 7th January 2011 02:17

Downloads: 1966

Keywords:

Assuming the existence of one-way functions, we show that there is no

polynomial-time, differentially private algorithm $A$ that takes a database

$D\in (\{0,1\}^d)^n$ and outputs a ``synthetic database'' $\hat{D}$ all of whose two-way

marginals are approximately equal to those of $D$. (A two-way marginal is the fraction

of database rows $x\in \{0,1\}^d$ with a given pair of values in a given pair of columns.)

This answers a question of Barak et al. (PODS `07), who gave an algorithm running in time

$\poly(n,2^d)$.

Our proof combines a construction of hard-to-sanitize databases based on digital signatures (by Dwork et al., STOC `09) with PCP-based Levin-reductions from NP search problems to finding approximate solutions to CSPs.

Revision #1 Authors: Jonathan Ullman, Salil Vadhan

Accepted on: 6th November 2010 04:43

Downloads: 1605

Keywords:

Assuming the existence of one-way functions, we show that there is no

polynomial-time, differentially private algorithm $A$ that takes a database

$D\in (\{0,1\}^d)^n$ and outputs a ``synthetic database'' $\hat{D}$ all of whose two-way

marginals are approximately equal to those of $D$. (A two-way marginal is the fraction

of database rows $x\in \{0,1\}^d$ with a given pair of values in a given pair of columns.)

This answers a question of Barak et al. (PODS `07), who gave an algorithm running in time

$poly(n,2^d)$.

Our proof combines a construction of hard-to-sanitize databases based on digital signatures (by Dwork et al., STOC `09) with encodings based on the PCP Theorem.

We also present both negative and positive results for generating ``relaxed'' synthetic data, where the fraction of rows in $D$ satisfying a predicate $c$ are estimated by applying $c$ to each row of $\hat{D}$ and aggregating the results in some way.

TR10-017 Authors: Jonathan Ullman, Salil Vadhan

Publication: 11th February 2010 10:59

Downloads: 1835

Keywords:

Assuming the existence of one-way functions, we show that there is no

polynomial-time, differentially private algorithm $A$ that takes a database

$D\in (\{0,1\}^d)^n$ and outputs a ``synthetic database'' $\hat{D}$ all of whose two-way

marginals are approximately equal to those of $D$. (A two-way marginal is the fraction

of database rows $x\in \{0,1\}^d$ with a given pair of values in a given pair of columns.)

This answers a question of Barak et al. (PODS `07), who gave an algorithm running in time

$\poly(n,2^d)$.

Our proof combines a construction of hard-to-sanitize databases based on digital signatures (by Dwork et al., STOC `09) with PCP-based Levin-reductions from NP search problems to finding approximate solutions to CSPs.