Opacus does not work on the provided GAN example and is broken on GANs in general #523

Open
gianmarcoaversanoenx opened this issue Oct 17, 2022 · 1 comment


gianmarcoaversanoenx commented Oct 17, 2022

🐛 Bug

I tried running the provided GAN example script and the following error was raised:

RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1, 1] because the unspecified dimension size -1 can be any value and is ambiguous
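
For what it is worth, the message itself is what PyTorch raises when a zero-element tensor is reshaped with an inferred dimension, so some tensor in the DP code path is apparently empty at that point. A stand-alone line that reproduces just the error message (not the Opacus code path, which I have not traced):

import torch

torch.empty(0).reshape(0, -1, 1)
# RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1, 1] ...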

I have actually been trying to use Opacus for other models that require multiple optimizers, with either plain PyTorch or PyTorch Lightning, and have always failed so far. I can run Opacus with PyTorch Lightning (via a Callback) only when there is a single optimizer.
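
For comparison, a minimal single-optimizer setup along these lines is the kind of thing that does work (plain PyTorch, Opacus 1.x; the model, data, and hyper-parameters here are illustrative stand-ins, not my actual code):

import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy model and data, just to show the single-optimizer wiring.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
data_loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))),
    batch_size=8,
)

# One model, one optimizer: make_private wires up the grad sampler,
# the DP optimizer, and the (Poisson-sampled) data loader in one call.
privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.0,
    max_grad_norm=1.0,
)

for data, target in data_loader:
    optimizer.zero_grad()
    loss = F.cross_entropy(model(data), target)
    loss.backward()
    optimizer.step()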

Expected behavior

No error is raised.

An example with a simple model using two optimizers

This works because it uses manual optimization.

from loguru import logger
import os
import warnings

import pytorch_lightning as pl
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.optim.optimizer import Optimizer
import torchmetrics
from opacus import PrivacyEngine, GradSampleModule
from opacus.accountants import RDPAccountant
from opacus.data_loader import DPDataLoader
from opacus.optimizers import DPOptimizer
from opacus.lightning import DPLightningDataModule
from pl_bolts.datamodules import MNISTDataModule
from pytorch_lightning.utilities.cli import LightningCLI


class LitSampleConvNetClassifier(pl.LightningModule):
    def __init__(
        self,
        lr: float = 0.1,
        enable_dp: bool = True,
        delta: float = 1e-5,
        noise_multiplier: float = 1.0,
        max_grad_norm: float = 1.0,
    ):
        """A simple conv-net for classifying MNIST with differential privacy training
        Args:
            lr: Learning rate
            enable_dp: Enables training with privacy guarantees using Opacus (if True), vanilla SGD otherwise
            delta: Target delta for which (eps, delta)-DP is computed
            noise_multiplier: Noise multiplier
            max_grad_norm: Clip per-sample gradients to this norm
        """
        super().__init__()
        # Hyper-parameters
        self.lr = lr
        # Parameters
        self.model = torch.nn.Sequential(
            torch.nn.Conv2d(1, 16, 8, 2, padding=3),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(2, 1), 
            torch.nn.Conv2d(16, 32, 4, 2), 
            torch.nn.ReLU(), 
            torch.nn.MaxPool2d(2, 1), 
            torch.nn.Flatten(), 
            torch.nn.Linear(32 * 4 * 4, 32), 
            torch.nn.ReLU(), 
            torch.nn.Linear(32, 10)
        )
        # Metrics
        self.test_accuracy = torchmetrics.Accuracy()
        # Differential privacy
        self.accountant = RDPAccountant()
        self.enable_dp = enable_dp
        self.delta = delta
        self.noise_multiplier = noise_multiplier
        self.max_grad_norm = max_grad_norm
        # Important: This property activates manual optimization.
        self.automatic_optimization = False

    def forward(self, x):
        return self.model(x)

    def configure_optimizers(self):
        print("Configuring optimizers...")
        parameters = list(self.model.parameters())
        params1 = parameters[:3]
        params2 = parameters[3:]
        optimizers = [
            torch.optim.SGD(params1, lr=0.05),
            torch.optim.SGD(params2, lr=0.05),
        ]
        # Privacy: wrap the model for per-sample gradients and wrap each optimizer in a DPOptimizer.
        if not isinstance(self.model, GradSampleModule):
            self.model = GradSampleModule(self.model)
        # Grab the train dataloader via Lightning's (private) data connector to size the DP optimizers.
        data_loader = self.trainer._data_connector._train_dataloader_source.dataloader()
        sample_rate: float = 1 / len(data_loader)
        dataset_size: int = len(data_loader.dataset)  # type: ignore
        expected_batch_size = int(dataset_size * sample_rate)
        for i, base_optimizer in enumerate(optimizers):
            dp_optimizer = DPOptimizer(
                optimizer=base_optimizer,
                noise_multiplier=self.noise_multiplier,
                max_grad_norm=self.max_grad_norm,
                expected_batch_size=expected_batch_size,
            )
            dp_optimizer.attach_step_hook(
                self.accountant.get_optimizer_hook_fn(sample_rate=sample_rate)
            )
            optimizers[i] = dp_optimizer
        # return
        return optimizers

    def training_step(self, batch, batch_idx):
        optimizers = self.optimizers()
        for optimizer_idx, optimizer in enumerate(optimizers):
            logger.debug(f"Optimizer idx: {optimizer_idx}")
            assert isinstance(optimizer, Optimizer)
            optimizer.zero_grad()
            loss = self.loss(batch)
            self.manual_backward(loss)
            optimizer.step()
        self.log("train/loss", loss, on_step=False, on_epoch=True, prog_bar=True)
        # return loss

    def on_train_epoch_end(self):
        # Logging privacy spent: (epsilon, delta)
        epsilon, best_alpha = self.accountant.get_privacy_spent(delta=self.delta)
        self.log("epsilon", epsilon, on_epoch=True, prog_bar=True)
        print(f"\nepsilon = {epsilon}; best_alpha = {best_alpha}")

    def loss(self, batch):
        data, target = batch
        output = self.model(data)
        loss = F.cross_entropy(output, target)
        return loss


from torchvision.datasets import MNIST
import torchvision.transforms as tfs
from torch.utils.data import DataLoader

dataloader = DataLoader(
    dataset=MNIST(
        '.data',
        download=True,
        transform=tfs.Compose(
            [
                tfs.ToTensor(),
                tfs.Normalize((0.1307,), (0.3081,)),
            ]
        ),
    ),
    batch_size=8,
)
model = LitSampleConvNetClassifier()


# Stub for the DPDebug debugging callback used below (not included in this snippet).
class DPDebug(pl.Callback):
    pass


trainer = pl.Trainer(
    logger=False,
    enable_checkpointing=False,
    max_steps=1,
    enable_model_summary=False,
    callbacks=[DPDebug()],
)
trainer.fit(model, train_dataloaders=dataloader)

However, if I switch to automatic optimization, it crashes. To switch, I do the following:

Set this in the model (i.e. let Lightning drive the optimizers):

self.automatic_optimization = True

Change the training_step. Under automatic optimization, Lightning calls it once per optimizer per batch (passing the optimizer index) and performs zero_grad, backward, and step itself, so those calls are removed:

    def training_step(self, batch, batch_idx, optimizer_idx):
        logger.debug(f"Optimizer idx: {optimizer_idx}")
        loss = self.loss(batch)
        self.log("train/loss", loss, on_step=False, on_epoch=True, prog_bar=True)
        return loss

Environment

Collecting environment information...
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 12.6 (x86_64)
GCC version: Could not collect
Clang version: 14.0.0 (clang-1400.0.29.102)
CMake version: version 3.23.1
Libc version: N/A

Python version: 3.8.13 (default, Jul 11 2022, 17:57:22)  [Clang 13.1.6 (clang-1316.0.21.2.5)] (64-bit runtime)
Python platform: macOS-12.6-x86_64-i386-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] functorch==0.2.1
[pip3] mypy==0.961
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.23.3
[pip3] pytest-mypy==0.9.1
[pip3] pytorch-lightning==1.7.7
[pip3] torch==1.12.1
[pip3] torch-cluster==1.6.0
[pip3] torch-geometric==2.1.0.post1
[pip3] torch-scatter==2.0.9
[pip3] torch-sparse==0.6.15
[pip3] torch-spline-conv==1.2.1
[pip3] torch-tb-profiler==0.4.0
[pip3] torchmetrics==0.9.3
[pip3] torchquad==0.3.0
[pip3] torchtext==0.13.1
[pip3] torchtyping==0.1.4
[pip3] torchvision==0.13.1