
Microbatching Support #655

Open
shs037 opened this issue Jul 13, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@shs037

shs037 commented Jul 13, 2024

🚀 Feature

Support microbatch size > 1, i.e., clipping the averaged gradient of a group of samples instead of each individual per-sample gradient.

Motivation

We want to experiment with microbatch size > 1 for some training tasks.

(I understand that microbatch size > 1 may not improve memory or computation efficiency. This ask is more about the algorithm and its utility.)

Pitch

A num_microbatches parameter in make_private, similar to TF Privacy.
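
For concreteness, here is a hypothetical sketch of what that could look like. The num_microbatches argument does not exist in Opacus today (the name mirrors TF Privacy's parameter); everything else is the existing make_private signature:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data_loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))), batch_size=32
)

privacy_engine = PrivacyEngine()

model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.0,
    max_grad_norm=1.0,
    num_microbatches=8,  # hypothetical: clip averages over groups of samples, not single samples
)
```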

@HuanyuZhang
Contributor

Thanks @shs037 for bringing this to the table! We currently have no plan to support this feature, given its limited use case inside Meta. However, I am happy to discuss the implementation if you want to contribute a PR. One quick idea is to make the change in the optimizer: instead of clipping each per-sample gradient, average the gradients within a microbatch first, then clip.
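
For illustration, a minimal sketch of the average-then-clip idea for a single parameter tensor (not Opacus code; the function name and the per-parameter clipping are simplifications, since real DP-SGD clips using the norm across all parameters):

```python
import torch

def average_then_clip(per_sample_grads: torch.Tensor,
                      num_microbatches: int,
                      max_grad_norm: float) -> torch.Tensor:
    """Illustrative sketch for one parameter.

    per_sample_grads: shape [batch_size, *param_shape].
    Groups the samples into `num_microbatches` microbatches, averages within
    each microbatch, clips each averaged gradient to max_grad_norm, and
    returns the sum of the clipped microbatch gradients (noise would then be
    added to this sum, as in ordinary DP-SGD).
    """
    batch_size = per_sample_grads.shape[0]
    assert batch_size % num_microbatches == 0
    mb_size = batch_size // num_microbatches

    # [num_microbatches, mb_size, *param_shape] -> average over the samples in each microbatch
    mb_grads = per_sample_grads.view(
        num_microbatches, mb_size, *per_sample_grads.shape[1:]
    ).mean(dim=1)

    # One clipping factor per microbatch
    norms = mb_grads.flatten(start_dim=1).norm(2, dim=1)
    clip_factor = (max_grad_norm / (norms + 1e-6)).clamp(max=1.0)
    clipped = mb_grads * clip_factor.view(-1, *([1] * (mb_grads.dim() - 1)))

    return clipped.sum(dim=0)
```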

@HuanyuZhang HuanyuZhang added the enhancement New feature or request label Jul 15, 2024
@shs037
Author

shs037 commented Jul 17, 2024

Thanks a lot! Is it basically like changing a few lines in the function you linked?

@HuanyuZhang
Contributor

HuanyuZhang commented Jul 21, 2024

Yeah, I think a hacky solution (without a very careful interface design) should require only minimal changes. self.grad_samples holds the per-sample gradients: for each parameter, a tensor of shape [batch_size, *parameter_shape]. You just need to split it into microbatches and average within each microbatch. You will probably also need to change scale_grad (https://github.com/pytorch/opacus/blob/main/opacus/optimizers/optimizer.py#L441) so that the final gradient is scaled correctly.
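
A rough sketch of what that hacky change could look like, loosely modeled on the structure of DPOptimizer.clip_and_accumulate (the function name and the microbatch_size argument are assumptions; the attributes used — grad_samples, params, max_grad_norm, summed_grad — are the ones the optimizer already exposes):

```python
import torch

def clip_and_accumulate_microbatched(optimizer, microbatch_size: int):
    """Illustrative sketch, not the actual Opacus implementation.

    Mirrors DPOptimizer.clip_and_accumulate, but averages the per-sample
    gradients within microbatches of `microbatch_size` samples before
    computing the clipping factors. Assumes the batch size is divisible
    by microbatch_size.
    """
    # optimizer.grad_samples: one tensor per parameter, shape [batch_size, *param_shape]
    grad_samples = optimizer.grad_samples
    batch_size = grad_samples[0].shape[0]
    num_microbatches = batch_size // microbatch_size

    # Average within each microbatch -> [num_microbatches, *param_shape]
    mb_grads = [
        g.view(num_microbatches, microbatch_size, *g.shape[1:]).mean(dim=1)
        for g in grad_samples
    ]

    # One clipping factor per microbatch, using the norm across all parameters
    per_param_norms = [g.reshape(num_microbatches, -1).norm(2, dim=-1) for g in mb_grads]
    per_mb_norms = torch.stack(per_param_norms, dim=1).norm(2, dim=1)
    clip_factor = (optimizer.max_grad_norm / (per_mb_norms + 1e-6)).clamp(max=1.0)

    # Accumulate the clipped, averaged microbatch gradients into summed_grad
    for p, g in zip(optimizer.params, mb_grads):
        clipped = torch.einsum("i,i...->...", clip_factor, g)
        if p.summed_grad is not None:
            p.summed_grad += clipped
        else:
            p.summed_grad = clipped
```

With this change, scale_grad would presumably need to divide by the number of clipped units per step (expected_batch_size / microbatch_size, times accumulated_iterations) instead of expected_batch_size, which is the scale correction mentioned above.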

@HuanyuZhang
Contributor

This approach might be problematic if you have multiple mini-batches between two optimizer steps (i.e., gradient accumulation), but I believe that is a very rare situation.
