Hidden randomness in defenses #65
Comments
We haven't carefully considered this yet. I would be partial to saying that a defense should work with any random seed, but that it is free to choose a fresh seed every time it classifies an image. If we instead allow the defense to only work with one seed the defender knows and the attacker doesn't, we're no longer in a fully white-box threat model: the defender now gets to hold something secret. But I think it would be worth discussing this to make sure there aren't any unintended consequences. Can you think of a defense where it makes sense to only work for one random seed but not others?
We have been testing a specific defense idea leveraging private randomness which I've emailed you about privately. Please let me know if you'd prefer to keep the rules discussion on this thread, in which case I'll try to rephrase our idea in a less specific way.
Let me take a look at your email.
I've been giving this some thought. I'm inclined to say "no": defenses must work with an arbitrary seed. If we allow defenses to have a secret seed, then what's to stop them from using it to initialize some weights of the neural network? That would give us a grey-box threat model, which we explicitly want to avoid. @catherio @nottombrown do you have any thoughts?
That's my inclination, too, but maybe you could forward the email so I can think about this specific case?
Ok, having read this, I agree with @carlini. The randomness is to be viewed as coming from "the world"; the defender has to accept what it is given, and work well under all such situations.
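To make the rule discussed above concrete, here is a minimal sketch of how an evaluator might check that a randomized defense works under arbitrary seeds rather than one secret seed. Everything in it is an assumption for illustration: the `Defense` signature, the toy `noisy_threshold_defense`, the `worst_case_accuracy` helper, and the example data are hypothetical and are not part of the contest code.

```python
import random
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical defense signature: (input, rng) -> predicted label.
# The rng is handed to the defense by the evaluator ("the world").
Defense = Callable[[Sequence[float], random.Random], int]


def noisy_threshold_defense(x: Sequence[float], rng: random.Random) -> int:
    """Toy randomized classifier: add small Gaussian noise, then threshold the mean."""
    noisy = [v + rng.gauss(0.0, 0.01) for v in x]
    return int(sum(noisy) / len(noisy) > 0.5)


def worst_case_accuracy(
    defense: Defense,
    data: List[Tuple[Sequence[float], int]],
    seeds: Iterable[int],
) -> float:
    """Return the lowest accuracy observed across the supplied seeds."""
    worst = 1.0
    for seed in seeds:
        # The seed is chosen by the evaluator, not kept secret by the defender.
        rng = random.Random(seed)
        correct = sum(defense(x, rng) == y for x, y in data)
        worst = min(worst, correct / len(data))
    return worst


# Illustrative data only: two well-separated points with labels 1 and 0.
EXAMPLE_DATA = [([0.9, 0.9], 1), ([0.1, 0.1], 0)]

if __name__ == "__main__":
    print(worst_case_accuracy(noisy_threshold_defense, EXAMPLE_DATA, range(10)))
```

Reporting the minimum over seeds, rather than the accuracy for a single fixed seed, reflects the position in this thread that the defender must perform well whatever randomness it is given.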
The contest proposal states:
We have a few clarifying questions: