Privacy is Consent (alt: On Optimal Privacy vs. System Efficiency)

This fieldnote documents my notes on a characterization of systems I’ve been calling “optimal privacy”, which intersects and overlaps with many other concepts in the literature, e.g. “privacy and computational complexity” or “approximate privacy”.

Starting with an observation: privacy-preserving protocols are less efficient than non-privacy-preserving protocols - that is to say, it seems that one must always communicate more to say less. (As far as I can tell, this observation was first made by Beaver [1].)

This raises two interesting questions: what do we mean when we say we can compute a function privately, and how does optimal privacy relate to overall system efficiency?

Beaver [1], Chor [2], Kushilevitz [2][3] and others published much on this in the late 80s to early 90s, and what follows below is a summary and rephrasing of their work.

What functions can be computed privately?

Let’s consider two parties Alice & Bob, who wish to compute a function $f(x,y)$ where $x$ is an input provided by Alice, and $y$ is an input provided by Bob.

We note that Alice could simply send $x$ to Bob (or Bob could send $y$ to Alice) and the recipient would be able to compute the function while keeping their own input private. We would, however, like to explore ways such that both can compute the function while revealing as little as possible about their inputs.

In the ideal case, what is generally referred to as perfect privacy, we would like to reveal only information that can be derived from either party’s private input and the value of the function.

While generally left unspecified in earlier papers on this subject, I want to explicitly add one additional constraint to the definition of a perfectly private protocol:

  • The information about the private inputs that is communicated, or that can be derived, as a result of the protocol should be symmetric for both parties

Any function $f(x,y)$ can be reduced and written as a matrix $M$ where the matrix coordinate $M(x,y) = f(x,y)$. As an example, the following is a representation of an AND function, and an OR function:

$$AND(x,y) = \begin{pmatrix}0 & 0\\ 0 & 1\end{pmatrix}\,\, OR(x,y) = \begin{pmatrix}0 & 1\\ 1 & 1\end{pmatrix}$$
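
As a quick illustration (a minimal sketch; the `function_matrix` helper is my own and not taken from any of the referenced papers), such matrices can be tabulated directly for small finite input domains:

```python
from typing import Callable, List

def function_matrix(f: Callable[[int, int], int], xs: range, ys: range) -> List[List[int]]:
    """Tabulate M where M[x][y] = f(x, y); rows are Alice's inputs, columns are Bob's."""
    return [[f(x, y) for y in ys] for x in xs]

AND = function_matrix(lambda x, y: x & y, range(2), range(2))  # [[0, 0], [0, 1]]
OR  = function_matrix(lambda x, y: x | y, range(2), range(2))  # [[0, 1], [1, 1]]
```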

Given this representation we can start to construct rules and intuition about which functions can be made to be privacy preserving and which cannot.

Trivially, any matrix which is insensitive to both $x$ and $y$ can be made privacy-preserving, e.g. $\forall x,y;$ $f(x,y) = 0$:

$$f(x,y) = \begin{pmatrix}0 & 0 & \dots \\ 0 & 0 & \dots \\ \vdots & \vdots & \ddots \end{pmatrix}$$

Given such a matrix, either party can communicate the value of the function without any input from the other (because the value is constant).

Partitionable Functions

It is perhaps clear at this point that in order to privately compute a non-trivial function we require each party to reveal some information about their input.

Such information is necessary to partition the function matrix into a smaller value space. Given enough information, our goal is to partition the value space until we produce a trivial matrix.

As an example, let’s consider the function $f(x,y) = x + y \bmod 2$, which has the function matrix below:

$$f(x,y) = \begin{pmatrix}0 & 1 & 0 & 1 & 0 & 1 & \dots \\ 1 & 0 & 1 & 0 & 1 & 0 & \dots\\ 0 & 1 & 0 & 1 & 0 & 1 & \dots\\ 1 & 0 & 1 & 0 & 1 & 0 & \dots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

Such a function can be partitioned based on a single bit of information, e.g. whether $y$ is divisible by 2:

$$f(x,y) = \begin{pmatrix}0 & 0 & 0 & \dots \\ 1 & 1 & 1 & \dots\\ 0 & 0 & 0 & \dots\\ 1 & 1 & 1 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

or not:

$$f(x,y) = \begin{pmatrix}1 & 1 & 1 & \dots\\ 0 & 0 & 0 & \dots\\ 1 & 1 & 1 & \dots \\ 0 & 0 & 0 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

We can partition each matrix again, based on whether $x$ is divisible by 2 or not. Regardless of which of the above partitioned matrices we start with, the resulting matrices are the same:

$$f(x,y) = \begin{pmatrix}0 & 0 & 0 & \dots\\ 0 & 0 & 0 & \dots\\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

$$f(x,y) = \begin{pmatrix}1 & 1 & 1 & \dots\\ 1 & 1 & 1 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$

Both resulting matrices are constant and insensitive to any other information about $x$ or $y$. We have revealed the minimal amount of information necessary to compute the (admittedly rather simple) function.
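
A minimal sketch of the resulting protocol (the name and framing are mine, purely for illustration): each party reveals a single bit, the parity of its input, and both can then read off the value of the now-constant matrix.

```python
def parity_protocol(x: int, y: int) -> int:
    """Compute (x + y) mod 2 while revealing only one bit per party."""
    alice_bit = x % 2  # Alice reveals whether x is divisible by 2
    bob_bit = y % 2    # Bob reveals whether y is divisible by 2
    # Both parties can now evaluate the constant value of the partitioned matrix.
    return (alice_bit + bob_bit) % 2
```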

Non-partitionable Functions

Before we dive deeper into the above, it is worth pointing out that not all functions are partitionable (in fact, the majority are not).

Kushilevitz [3] provides us with a definition of forbidden matrices: function matrices which cannot be partitioned. For the sake of clarity, I am going to use the term non-partitionable.

Intuitively, we cannot partition functions if there is no way to partition the input variable space in a way that cleanly partitions the output value space.

In our above example, we could partition both input variables by the “divisible by 2” check. (Note that in this case we could have started with either $x$ or $y$ - that is we could partition by rows or by columns.)

To understand why some functions are non-partitionable, let us start by looking at a concrete example of a function that is impossible to compute in a private manner, the AND function, defined over the input domain $[0,3]$:

$$AND(x,y) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{pmatrix}$$

Observe how every row and column is unique - there is no bit of information that can be given that subdivides the matrix into (logical) discrete partitions.

More formally, every potential input $x$ and $y$ is transitively related to every other input $x$ and $y$ respectively.

That is to say that two inputs $x_1$ and $x_2$ share a relation $x_1 \sim x_2$ if there exists a $y$ such that $M_{{x_1}y} = M_{{x_2}y}$, and in this matrix every $x_1$ is related to at least one other $x_2$.

The equivalence relation $\equiv$ on the rows of the matrix is defined as the transitive closure of $\sim$. That is, $x_1 \equiv x_2$ if there exists a sequence of inputs $x_{i_1}, x_{i_2}, x_{i_3}, \dots$ such that $x_1 \sim x_{i_1} \sim x_{i_2} \sim x_{i_3} \sim \dots \sim x_2$.

A similar relation is defined over $y$ for the columns of the matrix.

We can now see that if all the rows and all the columns of a submatrix are equivalent, there can be no way to partition the submatrix i.e. for any possible partition there will always be at least one output value that is shared by the input values of multiple partitions.

(Recursively, a matrix cannot be partitioned if it contains a submatrix that cannot be partitioned.)

Going back to our AND function, we can see that this is the case, even if we just consider the submatrix for the input domain $[0,1]$.

For $y$, the columns $y=0$ and $y=1$ are related, since:

$$ f(0,0) = f(0,1) = 0 $$

And for $x$, the rows $x=0$ and $x=1$ are related, since:

$$ f(0,0) = f(1,0) = 0 $$

This means that there is no way of partitioning the submatrix of the AND function without revealing all of $x$ and $y$ - thus it is impossible to compute AND between two parties in an information-theoretically private way (we will leave aside computing similar functions in the computational model of privacy until later).
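
As a rough sketch of how one might check for this condition mechanically (my own illustration; the helper names are hypothetical and not taken from [3]): compute the equivalence classes of the rows and of the columns, and flag any non-constant matrix where both collapse to a single class.

```python
from itertools import combinations

def equivalence_classes(M):
    """Group the rows of M under the transitive closure of ~, where
    row i ~ row j iff they agree on the output in at least one column."""
    parent = list(range(len(M)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(M)), 2):
        if any(a == b for a, b in zip(M[i], M[j])):
            parent[find(i)] = find(j)  # i ~ j, so merge their classes

    classes = {}
    for i in range(len(M)):
        classes.setdefault(find(i), []).append(i)
    return list(classes.values())

# The 4x4 AND matrix from above: a single row class and a single column class,
# so (being non-constant) it cannot be partitioned.
AND = [[0, 0, 0, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 1],
       [0, 1, 1, 1]]
rows = equivalence_classes(AND)                           # [[0, 1, 2, 3]]
cols = equivalence_classes([list(c) for c in zip(*AND)])  # [[0, 1, 2, 3]]
```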

From the above we can also observe that functions with a unique output for every pair of inputs can never be computed without revealing both inputs, i.e. in order to maintain privacy over the input variables of Alice & Bob, the function’s output domain must be of lower order than its input domain.

A Note on the Optimal Partitioning of Function Matrices

This fieldnote will not dive into strategies for determining the optimal partition of a given function, but by now it may have occurred to you that “partitioning matrices” isn’t exactly a trivial step. There are often multiple equivalent ways of partitioning certain functions, and both parties must agree on a given optimal partition and have a way of efficiently communicating which partition they are selecting during each round.

Additional Information & Privacy

As we noted further up, we can achieve perfect privacy for one party by sacrificing the privacy of the other.

We can therefore think of optimal-privacy as the goal of minimizing the amount of information either party must reveal about their inputs.

(Optimal) Privacy can be seen as a metafunction which takes in a function matrix and outputs the minimum amount of additional information necessary to resolve the function for any pair of inputs. Bar-Yehuda, Chor and Kushilevitz [4] proved tight bounds on the minimum amount of information that must be revealed for two-party functions.

As an example, consider the following function:

$$f(x,y) = \begin{cases} -1 & x = y \\ min(x,y) & x \neq y \end{cases}$$

We can represent the function over the input domain [0,3] as follows:

$$f(x,y) = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 1 & 1 \\ 0 & 1 & -1 & 2 \\ 0 & 1 & 2 & -1 \end{pmatrix}$$
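
For reference, tabulating the function over $[0,3]$ (a quick sketch) reproduces the matrix above:

```python
f = lambda x, y: -1 if x == y else min(x, y)
matrix = [[f(x, y) for y in range(4)] for x in range(4)]
# [[-1, 0, 0, 0], [0, -1, 1, 1], [0, 1, -1, 2], [0, 1, 2, -1]]
```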

We note that in the above function at least one party always learns the other’s input (the minimum), and both parties reveal their input in the case that the numbers are the same. Because of this it can be tempting to think of such a function as not perfectly private; however, per our definition above, it is.

The matrix of the function above is partitionable by revealing the most significant bit of each input. Alice and Bob take turns revealing significant bits, each of which cuts the value space in half.

In some of the leaf nodes (e.g. if Alice chooses 2 or 3 and Bob chooses 0 or 1, as in the bottom left of the matrix, or vice versa in the top right) we see that the matrix decomposes into two monochrome submatrices, allowing the party with the maximum value to retain some privacy of their input.

In others (as seen in the top left & bottom right), the matrices decompose in such a way that revealing the value also reveals both inputs (either the values are equal, or through the process of elimination there are no other values that either party could possess).
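
A minimal sketch of this bit-by-bit partitioning (my own illustration, assuming inputs in $[0, 2^n)$; it is not a protocol taken verbatim from the references):

```python
def min_or_tie(x: int, y: int, n_bits: int) -> int:
    """f(x, y) = -1 if x == y else min(x, y), computed by revealing most
    significant bits in turn. Only the party holding the smaller value ever
    has to reveal its remaining bits; the larger value stays partially private."""
    for k in reversed(range(n_bits)):
        alice_bit = (x >> k) & 1
        bob_bit = (y >> k) & 1
        if alice_bit != bob_bit:
            # The party with the 0 bit holds the minimum and reveals it.
            return x if alice_bit == 0 else y
    return -1  # every bit agreed, so x == y
```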

Sadly, in the information-theoretic model, the number of functions that fall within the input space of our metafunction is frustratingly limited. There are simply not that many interesting functions we can compute with perfect information-theoretic privacy.

For more interesting functions we are forced to make tradeoffs, either by limiting the computational power of the parties (i.e. achieving privacy through the properties of certain cryptographic problems) or by accepting that we can only achieve approximate privacy.

The Millionaires Problem

Alice and Bob wish to know which one of them is richer, without revealing their wealth. They construct the following function:

$$f(x,y) = \begin{cases} 0 & x \geq y \\ 1 & x \lt y \\ \end{cases}$$

Note that the above breaks ties in favor of Alice. This forms the following function matrix:

$$f(x,y) = \begin{pmatrix} 0 & 1 & 1 & 1 & \dots \\ 0 & 0 & 1 & 1 & \dots\\ 0 & 0 & 0 & 1 & \dots \\ \vdots & \vdots & \vdots & \vdots & \ddots \\ \end{pmatrix}$$

Such a function matrix is non-partitionable, as every submatrix along the diagonal is non-partitionable; as such, perfect privacy is impossible.

The Bisection Protocol (as defined by Feigenbaum et al. [5]) provides good privacy in the average case. Much like the strategy for the $min(x,y)$-based function above, we compare the parties’ inputs bit by bit, starting from the most significant bit, and stop once the revealed bits differ.
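
A rough sketch of the protocol as I understand it (hypothetical names; inputs assumed to be $n$-bit values in $[0, 2^n)$):

```python
def bisection(x: int, y: int, n_bits: int) -> int:
    """Millionaires problem via bisection: the parties reveal one most
    significant bit at a time and stop at the first bit where they differ.
    Returns 0 if x >= y, 1 if x < y (ties are broken in Alice's favour)."""
    for k in reversed(range(n_bits)):
        alice_bit = (x >> k) & 1
        bob_bit = (y >> k) & 1
        if alice_bit != bob_bit:
            return 0 if alice_bit > bob_bit else 1
    return 0  # inputs are equal; ties favour Alice
```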

The bisection protocol applied to the millionaires problem is optimally private with respect to the party with the least amount of money: in addition to learning the most significant bit at which their wealth differs from the richer party’s, they also (by way of their private input) learn lower and upper bounds on the difference between their wealth and the richer party’s, i.e. they learn that the input of the wealthier party is in the interval $[2^{n-k}, 2^n)$ where $k$ is the index of the most significant bit at which the inputs differ. In contrast, the wealthier party only learns that the other has an input in the interval $[0, 2^{n-k})$.

More clearly stated, the information revealed to each party by the protocol is asymmetric.

Further, the greater the difference between the two inputs, the greater the asymmetry. Consider the worst-case example, where Alice has $2^n - 1$ wealth (the maximum possible) and Bob has $0$.

In the first comparison of the Bisection Protocol, Alice will learn that they have the greater wealth (thus they have computed the function), and will additionally learn that Bob’s wealth is in the interval $[0, 2^{n-1})$. Bob will likewise learn that Alice has more wealth, and that Alice’s wealth is in the interval $[2^{n-1}, 2^n)$.

By comparison, in the case where Alice and Bob have the same (or nearly the same) wealth, the protocol runs for longer and the information asymmetry approaches 0 (however, this trend runs inverse to the total amount of information both parties gain).

At this point, it is worth explicitly noting that we can greatly improve the privacy of a protocol to compute the millionaires problem if we can limit the computational power of both parties (see Yao [6]). (This also relies on the assumption that one-way functions exist.)

In the case of the millionaires problem, we can use Oblivious Transfer to evaluate the inputs of a Garbled Circuit, in addition to numerous other protocols. Such protocols will ensure that each party learns only the value of the function (under the limited computational power & one-way functions assumptions).

We have finally reached the point in this fieldnote where we can start considering the second question: how does achieving optimal-privacy relate to the overall communication efficiency of the system?

Regardless of whether a function can be computed with perfect privacy (or the assumptions made about computational power), the communication complexity, in terms of the number of rounds, of an optimally private computation is dramatically greater than that of a non-private computation.

As an example, a non-private computation of the millionaires problem can be completed in a single round of communication (where each party transmits once). The bisection protocol requires at least one round of communication, and at worst $n$ rounds of communication ($n = \log_{2}N$ where $N$ is the maximum value of the input domain). The simplest oblivious transfer protocol [7], on the other hand, requires $1.5$ rounds to transmit a single bit (and thus $1.5 \cdot n$ rounds of communication, not including the communication required to set up the garbled circuit).

We have to engage in much more conversation, to transmit less information.

More formally, we know that if a function can be computed with perfect privacy, it can be computed in at worst $2 \cdot 2^n$ rounds (see [1]). We also know that we can trade additional information for improvements in communication complexity ([4]) - that is, we can sacrifice privacy to gain efficiency. We can choose to give up more information about our (private) inputs to improve the efficiency of the system.

Thus we hit upon a philosophical notion of the nature of privacy & consent. Consent is the degree to which you are willing to reveal additional information to improve the efficiency of a system. Privacy is the degree to which you are unwilling to reveal additional information. Privacy and Consent are poles on the same spectrum. Privacy is negative consent, Consent is negative privacy.

Privacy is Consent

References

1. Beaver, Donald. Perfect privacy for two-party protocols. Harvard University, Center for Research in Computing Technology, Aiken Computation Laboratory, 1989.
2. Chor, Benny, and Eyal Kushilevitz. “A zero-one law for boolean privacy.” SIAM Journal on Discrete Mathematics 4.1 (1991): 36-47.
3. Kushilevitz, Eyal. “Privacy and communication complexity.” SIAM Journal on Discrete Mathematics 5.2 (1992): 273-284.
4. R. Bar-Yehuda, B. Chor, E. Kushilevitz and A. Orlitsky, “Privacy, additional information and communication,” in IEEE Transactions on Information Theory, vol. 39, no. 6, pp. 1930-1943, Nov. 1993. doi: 10.1109/18.265501
5. Feigenbaum, Joan, Aaron D. Jaggard, and Michael Schapira. “Approximate privacy: foundations and quantification.” Proceedings of the 11th ACM conference on Electronic commerce. ACM, 2010.
6. A. C. Yao, “Protocols for secure computations,” 23rd Annual Symposium on Foundations of Computer Science (FOCS 1982), pp. 160-164, 1982. doi: 10.1109/SFCS.1982.88
7. Chou, Tung, and Claudio Orlandi. “The simplest protocol for oblivious transfer.” International Conference on Cryptology and Information Security in Latin America. Springer, Cham, 2015.