Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prio3: Add guidance for choosing PROOFS and Flp.Field #311

Closed
cjpatton opened this issue Nov 16, 2023 · 8 comments · Fixed by #325
Closed

Prio3: Add guidance for choosing PROOFS and Flp.Field #311

cjpatton opened this issue Nov 16, 2023 · 8 comments · Fixed by #325
Assignees

Comments

@cjpatton
Copy link
Collaborator

cjpatton commented Nov 16, 2023

If a small field is used, then it may be necessary to use a larger number of proofs in order to achieve the desired level of robustness. (Privacy is not significantly impacted by the choice of these parameters.) The draft needs to provide some guidance for choosing this.

We may also consider re-parameterizing the existing Prio3 variants.

See #177 for initial discussion and analysis.

cc/ @albertpl

@cjpatton
Copy link
Collaborator Author

cjpatton commented Dec 5, 2023

(REVISED, see comment below.)

I've written a sage script (#318) for plotting robustness of Prio3SumVec for various choices of field and number of proofs. The generated plot is below. What it shows is an upper bound of the probability that $1$ in $1,000,000,000$ accepted reports is invalid, given a amount of computational resources ($2^{80}$ random oracle queries, which is a pretty comfortable margin). The x-axis shows the input length; the number of bits is $1$; and the chunk length is set to the asymptotic optimal (square root of the length).

prio3_sum_vec

  • The baseline is Field128/1 (one proof). The probability of a robustness error is definitely non-negligible (where "negligible" is approximately the probability of correctly guessing an AES key), but it's still quite unlikely to happen. For instance, for input size = $100,000$, the probability that $1$ in every $1$ billion accepted reports is invalid is at most $2^{-36}$. This is the same as the probability of ruining $1$ in every $100,000$ batches assuming a batch size of $10,000$.
  • Field64/1 is quite terrible and should not be deployed.
  • Field64/2 is worse than Field128/1, but may be tolerable depending on the application.
  • Field64/3 is comfortably better than Field128/1.
  • Field64/4 is much more robust than we'd ever need.

Based on this analysis, I'd suggest the following text change. Top of Section 7.4:

For circuits for which the number of gadget calls grows sub-linearily in the parameters (e.g., Sum, SumVec, or Histogram), a smaller field can be used as long as additional proofs are generated and verified. When replacing Field128 with Field64 it is RECOMMENDED to use 3 proofs. However, some applications may tolerate as few as 2.

@albertpl
Copy link
Contributor

albertpl commented Dec 6, 2023

This assumes the adversary donates 1000000000 reports? either control the number of devices or break authentication / rate limit.

@wangshan
Copy link
Contributor

wangshan commented Dec 6, 2023

is there an explanation that Field64/1 prob is > 1?

@cjpatton
Copy link
Collaborator Author

cjpatton commented Dec 6, 2023

@albertpl This assumes the adversary donates 1000000000 reports? either control the number of devices or break authentication / rate limit.

The bond actually says something slightly different: it tells us the maximum probability that, among 1 billion reports that we accepted, at least one of them is invalid. In other words, we must include benign (i.e., not malicious reports) as part of the attack. (Intuitively, this is because VDAF preparation depends on a secret key, and information about this key gets exposed to the adversary over the course of its attack. For details, see the robustness game in ia.cr/2023/130, Figure 3.)

@wangshan is there an explanation that Field64/1 prob is > 1?

The thing that's plotted is an upperbound of the probability, not the probability. If the bounds is > 1, then it's called "vacuous", meaning it tells us nothing about the chances of a successful attack. The way to interpret this is: the parameters are too weak to provide meaningful robustness.

@cjpatton
Copy link
Collaborator Author

cjpatton commented Dec 6, 2023

Thanks @junyechen1996 and @albertpl for your feedback on the PR. Due primarily to #318 (comment), we need to revise the bounds:

prio3_sum_vec_revised

  • Field128/1 (baseline): For input size = $100,000$, the probability that $1$ in every $1$ billion accepted reports is invalid is at most $2^{-29}$.
  • Field64/2 is significantly worse than Field128/1, but may be tolerable for some applications and for small input sizes.
  • Field64/3 is comfortably better than Field128/1.

I'd revise the text change as follows:

For some circuits, a smaller field can be used safely as long as additional proofs are generated and verified. When replacing Field128 with Field64 it is RECOMMENDED to use 3 proofs. However, some applications may tolerate as few as 2.

@cjpatton
Copy link
Collaborator Author

cjpatton commented Dec 7, 2023

Slight improvement after fixing a typo pointed out by @junyechen1996:

prio3_sum_vec_revised2

@bwesterb
Copy link

bwesterb commented Dec 8, 2023

To help with the tightness of the bound, here is a simple concrete attack. I do lack quite a bit of context, so I hope it's not too far off the actual problem.

From what I understand the goal is to find field elements x_1, ..., x_n such that p(r) = 0, where r = H(x_1, ..., x_n) and p(s) = sum_i=1^n x_i (1 - x_i) x^(i-1), but where x_i not in {0,1} for some i.

For the moment assume n is even, and there is a nth degree primitive root of unity zeta in the field — that is: there is a zeta with zeta^n = 1 and zeta^i ≠ 1 for all 0 ≤ i < n.

Recall two basic facts about such a PROU:

  • zeta + zeta^2 + ... + zeta^n = 0, because for that sum S, we have (zeta - 1)S = zeta^(n+1) - zeta = 0, and zeta ≠ 1.

  • In fact, for any 0<i<n, we have zeta^i + zeta^(2i) + ... + zeta^(ni) = 0. Indeed, for this sum S, we have (zeta^i - 1) S = zeta^((n+1)i) - zeta^i = zeta^i - zeta^i = 0. As zeta^i ≠ 1 by definition of PROU, we have S = 0 as desired.

  • zeta, zeta^2, ..., zeta^n are all distinct, for if zeta^i = zeta^j with WLOG i > j, then zeta^(i-j)=1 contradicting n being a PROU.

Now, pick a random field element y not in {0,1}, and set x_1 = 0, x_2 = y, x_3 = y, ..., x_n = y.

Then for any 0 < i < n, we have p(zeta^i) = y(1-y) ( zeta^i + zeta^(2i) + ... + zeta^(ni) ) = 0.

Thus, first precompute Z = { zeta^1, ..., zeta^(n-1) }.

Then search for an y ≠ 0, 1 such that H( 0, y, y, ..., y ) in Z.

When found, we have broken robustness. Note that in any interesting case, essentially all time of the attack is spent in computing H.

With n = 2^20, and we're using a 64 bit field, we expect only 2^44 calls to H. That's practical.

When requiring two proofs for that 64 bit field, we're looking at 2^88 calls to H. That's possible to pull of, but very expensive (>10M$).

@cjpatton
Copy link
Collaborator Author

cjpatton commented Dec 16, 2023

To help with the tightness of the bound, here is a simple concrete attack. I do lack quite a bit of context, so I hope it's not too far off the actual problem.

From what I understand the goal is to find field elements x_1, ..., x_n such that p(r) = 0, where r = H(x_1, ..., x_n) and p(s) = sum_i=1^n x_i (1 - x_i) x^(i-1), but where x_i not in {0,1} for some i.

This is roughly the idea, but let me make it a little more precise. (@divergentdave please fix any bugs that you see.) We're attacking Prio3SumVec, whose circuit $C$ takes as input a vector $\vec x = (x_1, \ldots, x_n)$ and random input $r$ and outputs:

$C(\vec x, r) = \Sigma_{i=1}^{n}{r^i x_i (x_i - 1)}$

We can think of $x_i(x_i-1)$ as the $i$-th coefficient of a polynomial $p$ and computing $p(r)$:

  • If $x_i \in \{0,1\}$ then $C(\vec x, r) = 0$ for all $r$;
  • If $x_i \not\in \{0,1\}$ for some $i$ then $(\vec x, r) = 0$ with probability at most $\epsilon = n/|\mathbb{F}|$, assuming $r$ was chosen fairly. (This is because the maximum number of roots of $p$ is $n$.)

In fact, $r$ is not sampled, but derived, roughly as you've described. In more detail, the Client:

  • secret-shares the vector $\vec x$ into $[[\vec x]]_1, \ldots, [[\vec x]]_s$, where $s$ is the number of Aggregators
  • samples random "blinds" $B_1, \ldots, B_s$
  • computes $r = H([[\vec x]]_1, ..., [[\vec x]]_s, B_1, \ldots, B_s)$ so that the Aggregators can reconstruct $r$ without sharing their shares or blinds.

The question is how much computation do we have to do in order to break robustness of Prio3SumVec with advantage significantly better than $\epsilon$.

In fact, if we have multiple proofs, then we evaluate $C$ multiple times on independent random inputs output by $H$. In this case we need to be $\epsilon^k$ where $k$ is the number of proofs.

For the moment assume n is even, and there is a nth degree primitive root of unity zeta in the field — that is: there is a zeta with zeta^n = 1 and zeta^i ≠ 1 for all 0 ≤ i < n.

This is the case for us :)

Recall two basic facts about such a PROU:

* zeta + zeta^2 + ... + zeta^n = 0, because for that sum S, we have (zeta - 1)S = zeta^(n+1) - zeta = 0, and zeta ≠ 1.

* In fact, for any 0<i<n, we have zeta^i + zeta^(2i) + ... + zeta^(ni) = 0. Indeed, for this sum S, we have (zeta^i - 1) S = zeta^((n+1)i) - zeta^i = zeta^i - zeta^i = 0. As zeta^i ≠ 1 by definition of PROU, we have S = 0 as desired.

* zeta, zeta^2, ..., zeta^n are all distinct, for if zeta^i = zeta^j with WLOG i > j, then zeta^(i-j)=1 contradicting n being a PROU.

❤️

Now, pick a random field element y not in {0,1}, and set x_1 = 0, x_2 = y, x_3 = y, ..., x_n = y.

Then for any 0 < i < n, we have p(zeta^i) = y(1-y) ( zeta^i + zeta^(2i) + ... + zeta^(ni) ) = 0.

Thus, first precompute Z = { zeta^1, ..., zeta^(n-1) }.

Then search for an y ≠ 0, 1 such that H( 0, y, y, ..., y ) in Z.

I.e., fix coins for secret sharing and generating blinds and for each $y \in \mathbb{F}\setminus \{0,1\}$ and compute $r = H([[\vec x]]_1, ..., [[\vec x]]_s, B_1, \ldots, B_s)$. Stop when $r \in Z$.

When found, we have broken robustness. Note that in any interesting case, essentially all time of the attack is spent in computing H.

With n = 2^20, and we're using a 64 bit field, we expect only 2^44 calls to H. That's practical.

When requiring two proofs for that 64 bit field, we're looking at 2^88 calls to H. That's possible to pull of, but very expensive (>10M$).

I'd like to come up with a formula. Modeling $H$ as a random oracles it's just a series of independent Bernoulli's with probability $p = (|Z|/|\mathbb{F}|)^k = (n/|\mathbb{F}|)^k$ and we go until one of them succeeds.

That means the success probability after $q$ random oracle queries is $(1 - p)^q \cdot p$, and the expected number of random oracles queries we have to make is $1/p = (|\mathbb{F}|/n)^k$.

This confirms your math. It does seem like you could pull off $2^{88}$ evaluations of $H$ but it would require immense resources. For scale, here's a website that estimates humanity's Bitcoin-mining capacity, which is a pretty good approximation of the resources we have to break Prio: https://www.coinwarz.com/mining/bitcoin/hashrate-chart

For the record, Field128 with one proof and Field64 with three proofs are both definitely safe (expected $q$ of $2^{108}$ and $2^{132}$ respectively).

cjpatton added a commit that referenced this issue Jan 11, 2024
For Prio3, when joint randomness is used:

* RECOMMEND Field128
* Field64 MAY be used, but `PROOFS` MUST be `3`

The latter is motivated by issue #177. The recommendation is based on
the upper bound given by {{DPRS23}}, Theorem 1 and a matching attack
described by Bas Westerbaan (see issue #311).
cjpatton added a commit that referenced this issue Jan 11, 2024
For Prio3, when joint randomness is used:

* RECOMMEND Field128
* Field64 MAY be used, but `PROOFS` MUST be `3`

The latter is motivated by issue #177. The recommendation is based on
the upper bound given by {{DPRS23}}, Theorem 1 and a matching attack
described by Bas Westerbaan (see issue #311).
cjpatton added a commit that referenced this issue Jan 23, 2024
For Prio3, when joint randomness is used:

* RECOMMEND Field128
* Field64 MAY be used, but `PROOFS` MUST be `3`

The latter is motivated by issue #177. The recommendation is based on
the upper bound given by {{DPRS23}}, Theorem 1 and a matching attack
described by Bas Westerbaan (see issue #311).

Co-authored-by: Shan <[email protected]>
cjpatton added a commit that referenced this issue Jan 23, 2024
For Prio3, when joint randomness is used:

* RECOMMEND Field128
* Field64 MAY be used, but `PROOFS` MUST be `3`

The latter is motivated by issue #177. The recommendation is based on
the upper bound given by {{DPRS23}}, Theorem 1 and a matching attack
described by Bas Westerbaan (see issue #311).

Co-authored-by: Shan <[email protected]>
cjpatton added a commit that referenced this issue Jan 23, 2024
For Prio3, when joint randomness is used:

* RECOMMEND Field128
* Field64 MAY be used, but `PROOFS` MUST be `3`

The latter is motivated by issue #177. The recommendation is based on
the upper bound given by {{DPRS23}}, Theorem 1 and a matching attack
described by Bas Westerbaan (see issue #311).

Co-authored-by: Shan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants