Opacus does not work on the provided GAN example and is broken on GANs in general #523

Open
gianmarcoaversanoenx opened this issue Oct 17, 2022 · 1 comment


gianmarcoaversanoenx commented Oct 17, 2022

🐛 Bug

I tried running the provided GAN example script and the following error was raised:

RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1, 1] because the unspecified dimension size -1 can be any value and is ambiguous
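
For what it is worth, the message itself is what PyTorch raises when a zero-element tensor is reshaped with an inferred dimension, so some tensor in the DP code path is apparently empty at that point. A stand-alone line that reproduces just the error message (not the Opacus code path, which I have not traced):

import torch

torch.empty(0).reshape(0, -1, 1)
# RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1, 1] ...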

I have actually been trying to use Opacus for other models that require multiple optimizers, with either plain PyTorch or PyTorch Lightning, and have always failed so far. I can run Opacus with PyTorch Lightning (via a Callback) only when there is a single optimizer.
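
For comparison, a minimal single-optimizer setup along these lines is the kind of thing that does work (plain PyTorch, Opacus 1.x; the model, data, and hyper-parameters here are illustrative stand-ins, not my actual code):

import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy model and data, just to show the single-optimizer wiring.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
data_loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))),
    batch_size=8,
)

# One model, one optimizer: make_private wires up the grad sampler,
# the DP optimizer, and the (Poisson-sampled) data loader in one call.
privacy_engine = PrivacyEngine()
model, optimizer, data_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data_loader,
    noise_multiplier=1.0,
    max_grad_norm=1.0,
)

for data, target in data_loader:
    optimizer.zero_grad()
    loss = F.cross_entropy(model(data), target)
    loss.backward()
    optimizer.step()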

Expected behavior

No error is raised.

An example with a simple model using two optimizers

This works because it uses manual optimization.

from loguru import logger
import os
import warnings

import pytorch_lightning as pl
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.optim.optimizer import Optimizer
import torchmetrics
from opacus import PrivacyEngine, GradSampleModule
from opacus.accountants import RDPAccountant
from opacus.data_loader import DPDataLoader
from opacus.optimizers import DPOptimizer
from opacus.lightning import DPLightningDataModule
from pl_bolts.datamodules import MNISTDataModule
from pytorch_lightning.utilities.cli import LightningCLI


class LitSampleConvNetClassifier(pl.LightningModule):
    def __init__(
        self,
        lr: float = 0.1,
        enable_dp: bool = True,
        delta: float = 1e-5,
        noise_multiplier: float = 1.0,
        max_grad_norm: float = 1.0,
    ):
        """A simple conv-net for classifying MNIST with differential privacy training
        Args:
            lr: Learning rate
            enable_dp: Enables training with privacy guarantees using Opacus (if True), vanilla SGD otherwise
            delta: Target delta for which (eps, delta)-DP is computed
            noise_multiplier: Noise multiplier
            max_grad_norm: Clip per-sample gradients to this norm
        """
        super().__init__()
        # Hyper-parameters
        self.lr = lr
        # Parameters
        self.model = torch.nn.Sequential(
            torch.nn.Conv2d(1, 16, 8, 2, padding=3),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(2, 1), 
            torch.nn.Conv2d(16, 32, 4, 2), 
            torch.nn.ReLU(), 
            torch.nn.MaxPool2d(2, 1), 
            torch.nn.Flatten(), 
            torch.nn.Linear(32 * 4 * 4, 32), 
            torch.nn.ReLU(), 
            torch.nn.Linear(32, 10)
        )
        # Metrics
        self.test_accuracy = torchmetrics.Accuracy()
        # Differential privacy
        self.accountant = RDPAccountant()
        self.enable_dp = enable_dp
        self.delta = delta
        self.noise_multiplier = noise_multiplier
        self.max_grad_norm = max_grad_norm
        # Important: This property activates manual optimization.
        self.automatic_optimization = False

    def forward(self, x):
        return self.model(x)

    def configure_optimizers(self):
        print("Configuring optimizers...")
        parameters = list(self.model.parameters())
        params1 = parameters[:3]
        params2 = parameters[3:]
        optimizers = [
            torch.optim.SGD(params1, lr=0.05),
            torch.optim.SGD(params2, lr=0.05),
        ]
        # Privacy: wrap the model for per-sample gradients and wrap each optimizer in a DPOptimizer.
        if not isinstance(self.model, GradSampleModule):
            self.model = GradSampleModule(self.model)
        # Grab the train dataloader via Lightning's (private) data connector to size the DP optimizers.
        data_loader = self.trainer._data_connector._train_dataloader_source.dataloader()
        sample_rate: float = 1 / len(data_loader)
        dataset_size: int = len(data_loader.dataset)  # type: ignore
        expected_batch_size = int(dataset_size * sample_rate)
        for i, base_optimizer in enumerate(optimizers):
            dp_optimizer = DPOptimizer(
                optimizer=base_optimizer,
                noise_multiplier=self.noise_multiplier,
                max_grad_norm=self.max_grad_norm,
                expected_batch_size=expected_batch_size,
            )
            dp_optimizer.attach_step_hook(
                self.accountant.get_optimizer_hook_fn(sample_rate=sample_rate)
            )
            optimizers[i] = dp_optimizer
        # return
        return optimizers

    def training_step(self, batch, batch_idx):
        optimizers = self.optimizers()
        for optimizer_idx, optimizer in enumerate(optimizers):
            logger.debug(f"Optimizer idx: {optimizer_idx}")
            assert isinstance(optimizer, Optimizer)
            optimizer.zero_grad()
            loss = self.loss(batch)
            self.manual_backward(loss)
            optimizer.step()
        self.log("train/loss", loss, on_step=False, on_epoch=True, prog_bar=True)
        # return loss

    def on_train_epoch_end(self):
        # Logging privacy spent: (epsilon, delta)
        epsilon, best_alpha = self.accountant.get_privacy_spent(delta=self.delta)
        self.log("epsilon", epsilon, on_epoch=True, prog_bar=True)
        print(f"\nepsilon = {epsilon}; best_alpha = {best_alpha}")

    def loss(self, batch):
        data, target = batch
        output = self.model(data)
        loss = F.cross_entropy(output, target)
        return loss


from torchvision.datasets import MNIST
import torchvision.transforms as tfs
from torch.utils.data import DataLoader

dataloader = DataLoader(
    dataset=MNIST(
        '.data',
        download=True,
        transform=tfs.Compose(
            [
                tfs.ToTensor(),
                tfs.Normalize((0.1307,), (0.3081,)),
            ]
        ),
    ),
    batch_size=8,
)
model = LitSampleConvNetClassifier()


# Stub for the DPDebug debugging callback used below (not included in this snippet).
class DPDebug(pl.Callback):
    pass


trainer = pl.Trainer(
    logger=False,
    enable_checkpointing=False,
    max_steps=1,
    enable_model_summary=False,
    callbacks=[DPDebug()],
)
trainer.fit(model, train_dataloaders=dataloader)

However, if I switch to automatic optimization, it crashes. To switch, I do the following:

Set this in the model (i.e. let Lightning drive the optimizers):

self.automatic_optimization = True

Change the training_step. Under automatic optimization, Lightning calls it once per optimizer per batch (passing the optimizer index) and performs zero_grad, backward, and step itself, so those calls are removed:

    def training_step(self, batch, batch_idx, optimizer_idx):
        logger.debug(f"Optimizer idx: {optimizer_idx}")
        loss = self.loss(batch)
        self.log("train/loss", loss, on_step=False, on_epoch=True, prog_bar=True)
        return loss

Environment

Collecting environment information...
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 12.6 (x86_64)
GCC version: Could not collect
Clang version: 14.0.0 (clang-1400.0.29.102)
CMake version: version 3.23.1
Libc version: N/A

Python version: 3.8.13 (default, Jul 11 2022, 17:57:22)  [Clang 13.1.6 (clang-1316.0.21.2.5)] (64-bit runtime)
Python platform: macOS-12.6-x86_64-i386-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] functorch==0.2.1
[pip3] mypy==0.961
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.23.3
[pip3] pytest-mypy==0.9.1
[pip3] pytorch-lightning==1.7.7
[pip3] torch==1.12.1
[pip3] torch-cluster==1.6.0
[pip3] torch-geometric==2.1.0.post1
[pip3] torch-scatter==2.0.9
[pip3] torch-sparse==0.6.15
[pip3] torch-spline-conv==1.2.1
[pip3] torch-tb-profiler==0.4.0
[pip3] torchmetrics==0.9.3
[pip3] torchquad==0.3.0
[pip3] torchtext==0.13.1
[pip3] torchtyping==0.1.4
[pip3] torchvision==0.13.1