-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add discrete Laplace noise w/ SumVec and Histogram #3302
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like straightforward plumbing changes.
@@ -443,3 +458,60 @@ async fn janus_in_process_one_round_with_agg_param_time_interval() { | |||
) | |||
.await; | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Might as well check the plumbing of Prio3SumVec with a test, i.e. janus_in_process_sumvec_dp_noise()
?
let mut un_noised_result = [0u128; HISTOGRAM_LENGTH]; | ||
un_noised_result[0] = report_count.into(); | ||
// Smoke test: Just confirm that some noise was added. Since epsilon is small, the noise will be | ||
// large (drawn from Laplace_Z(20) + Laplace_Z(20)), and it is highly unlikely that all 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naive: what is _Z
? (you can point me to a paper :D).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lo-fi version of the notation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also naive: there is no bound on the noise that could be applied to the result, yes? I'm mildly uncomfortable that the set of all valid results for this test is "all integers except for n
", but seeing as the test fails if I turn off DP, I'm fine with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, the noise value could be any integer, thus we could be adding any field elements to the shares. (Though the probabilities become astronomically low once you get many multiples of the scale parameter out)
I also did a negative test where I cranked epsilon way up (to u128::MAX
) and confirmed that the test failed then. Even though it was still adding noise, the noise was all zeros because of the absurd parameter choice.
We could improve this with some sort of statistical test that takes each dimension of the noise vector as one sample of the noise distribution, and then performs a statistical test to see how good of a fit it is for the desired distribution. (with a low enough false positive rate to not cause problems) I filed #3325 to track this as a follow-up. It'll be a bit tricky because the distribution of the sum of two numbers drawn from base distributions is the convolution of those distributions, plus I'll need to read up on statistical tests.
This closes #3301. Depends on divviup/libprio-rs#1072.