Supplementary code for "Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning"
This code accompanies the paper:
Andrei Semenov, Philip Zmushko, Alexander Pichugin, Aleksandr Beznosikov. "Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning".
Date: August 2024
```bash
git clone https://github.com/Andron00e/JAST
cd JAST
pip install -r requirements.txt
```
- `unsplit` contains examples of the defense against the Model Inversion attack on MLPs, as well as differentially private protection for both CNN and MLP architectures.
- In `fsha`, you may find the code for the Feature-Space Hijacking attack on models with and without the dense layer, and verify the data protection.
- In the `results` folder, we store the necessary figures from the experiments.
- Please run `FSHA.ipynb`, `model_inversion_stealing.ipynb`, `dp_defense.ipynb`, and `mlp_mixer_model_inversion.ipynb` to observe the defense results at your convenience.
In our code, we consider the defense against the Model Inversion attack from "UnSplit" (Erdogan et al., 2022) (code) and the Feature-Space Hijacking Attack (FSHA) from "Unleashing the Tiger" (Pasquini et al., 2021) (code).
Both attacks require a number of hyperparameters, which we list below:
- Common arguments for the Split Learning protocol are: `batch_size`, `split_layer`, `dataset_name`, `device`, `n_epochs`, `architecture`, e.g. as in the sketch below.
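  A minimal configuration sketch (the variable names mirror the arguments above; the exact values for each experiment are set in the corresponding notebook cells):

  ```python
  # Illustrative values only -- check the corresponding notebook cell
  # for the settings used in each experiment.
  batch_size = 64
  split_layer = 2            # cut-layer depth, from 1 to 6
  dataset_name = "mnist"     # one of: "mnist", "f_mnist", "cifar10"
  device = "cuda"            # or "cpu"
  n_epochs = 20
  architecture = "mlp"       # one of: "mlp", "cnn", "mlp-mixer"
  ```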
- We conduct all experiments on the `mnist`, `f_mnist`, and `cifar10` datasets; for this purpose, assign the proper name to `dataset_name`. The main hyperparameter for validating our results is `split_layer`; feel free to set its number from `1` to `6`.
- Set the `architecture` to either `mlp`, `cnn`, or `mlp-mixer`. In the `cnn` case you will see the original performance of UnSplit and FSHA (except for the DP setup). Below, we describe the changes for the two mentioned settings.
- `fsha` folder:
  - For FSHA we use some special hyperparameters: `WGAN`, `gradient_penalty`, `style_loss`, `lr_f`, `lr_tilde`, `lr_D`. These hyperparameters refer to the training of the encoder, decoder, and discriminator networks; we took them from the original implementation (see code) and did not change them in our work. A sketch of the WGAN-style gradient-penalty term they control is given below.
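    For orientation, a standard WGAN gradient-penalty term looks roughly as follows (a minimal sketch, not the repository's implementation; `critic` stands for the discriminator network):

    ```python
    import torch

    def gradient_penalty(critic, real_feats, fake_feats):
        """Standard WGAN-GP penalty on interpolated feature batches of shape (B, D).
        Illustrative sketch; the FSHA code ships its own version."""
        alpha = torch.rand(real_feats.size(0), 1, device=real_feats.device)
        interp = (alpha * real_feats + (1 - alpha) * fake_feats).detach().requires_grad_(True)
        scores = critic(interp)
        grads = torch.autograd.grad(
            outputs=scores, inputs=interp,
            grad_outputs=torch.ones_like(scores),
            create_graph=True,
        )[0]
        return ((grads.norm(2, dim=1) - 1.0) ** 2).mean()
    ```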
  - The changes occur in `architectures.py`, where we introduce `pilot_mlp` and `discriminator_mlp` and keep `pilot_cnn` and `discriminator_cnn` unchanged. Set the `architecture` value to `mlp` to observe FSHA on the MLP-based model, while the core architecture in the `cnn` case is `resnet`. A rough sketch of the new MLP modules is shown below.
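    The new MLP modules could look roughly like this (an illustrative sketch; the actual class names, layer sizes, and activations are defined in `architectures.py`):

    ```python
    import torch.nn as nn

    # Illustrative sketch; see fsha/architectures.py for the real definitions.
    class PilotMLP(nn.Module):
        """Attacker-side pilot network mimicking the client's MLP encoder."""
        def __init__(self, input_dim=784, hidden_dim=256, feature_dim=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Flatten(),
                nn.Linear(input_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, feature_dim),
            )

        def forward(self, x):
            return self.net(x)

    class DiscriminatorMLP(nn.Module):
        """Critic that separates client features from pilot features."""
        def __init__(self, feature_dim=64, hidden_dim=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(feature_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, 1),
            )

        def forward(self, z):
            return self.net(z)
    ```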
- `unsplit` folder:
  - For UnSplit we mention other special hyperparameters: `lr`, `main_iters`, `input_iters`, `model_iters`, `lambda_l2`, `lambda_tv`. We suggest configuring them as laid out in our work (`0.001`, `200`, `20`, `20`, `0.1`, `1.`) for efficient reproduction of the results. We also stress that the `lambda_l2` regularizer was not mentioned in the original UnSplit paper's model-inversion attack algorithm. We also validate the performance of the UnSplit attack on CIFAR10. In this setup, we decided to use the MLP-Mixer (Tolstikhin et al., 2021) architecture, following the PyTorch implementation. In this case, the hyperparameter values are increased compared to the CNN-based models: we trained the MLP-Mixer from scratch for `n_epochs=50`. In addition, we use a `GradualWarmupScheduler`. A sketch of the regularized inversion objective that `lambda_l2` and `lambda_tv` enter is given below.
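    A hedged sketch of that reconstruction objective, assuming image-shaped inputs (the actual optimization loop lives in the `unsplit` code):

    ```python
    import torch
    import torch.nn.functional as F

    def inversion_loss(clone_model, x_hat, target_feats, lambda_l2=0.1, lambda_tv=1.0):
        """UnSplit-style objective: match the observed cut-layer features and
        regularize the reconstructed input with l2 and total-variation priors."""
        feat_term = F.mse_loss(clone_model(x_hat), target_feats)
        tv_term = (x_hat[..., 1:, :] - x_hat[..., :-1, :]).abs().mean() \
                + (x_hat[..., :, 1:] - x_hat[..., :, :-1]).abs().mean()
        l2_term = x_hat.pow(2).mean()
        return feat_term + lambda_tv * tv_term + lambda_l2 * l2_term
    ```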
  - When it comes to the DP setting, we use the same training hyperparameters as those used in the defense with MLPs against UnSplit. The difference lies in the code for adding noise to the dataloader. The key hyperparameters in this case are `epsilon` and `delta`, together with the global $\ell_2$ sensitivity. We use `calibrateAnalyticGaussianMechanism` from the Borja Balle et al., 2018 code to calculate `sigma` for each of the mentioned datasets. To achieve a proper utility-privacy trade-off, we suggest picking `epsilon=6`, `delta=0.5`, `n_epochs=20` for `mnist` and `f_mnist` (so the value of $\sigma$ equals `1.6` and `2.6`, respectively). A rough sketch of this calibration step is shown below.
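    A minimal sketch of the calibration and noise-injection step, assuming the Balle & Wang (2018) reference implementation is saved locally (the module name `agm`, the sensitivity value, and the `perturb` helper are illustrative):

    ```python
    import torch
    # Assumes the Balle & Wang (2018) reference code is available locally,
    # e.g. saved as agm.py; the module name here is an assumption.
    from agm import calibrateAnalyticGaussianMechanism

    epsilon, delta, l2_sensitivity = 6.0, 0.5, 1.0  # illustrative sensitivity
    sigma = calibrateAnalyticGaussianMechanism(epsilon, delta, l2_sensitivity)

    def perturb(batch, sigma):
        """Add calibrated Gaussian noise to inputs coming from the dataloader."""
        return batch + sigma * torch.randn_like(batch)
    ```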
  - We also conducted experiments on the DP defense for the CIFAR10 dataset, which we report in Table 3. For these experiments, please refer to `additional_dp_experiments.ipynb`. We used `n_epochs=50` and `epsilon`, `delta` values that result in `sigma=0.25` for CIFAR10.
We believe the details provided are clear enough to reproduce the main findings of our paper.
```bibtex
@misc{semenov2024justsimpletransformationdata,
      title={Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning},
      author={Andrei Semenov and Philip Zmushko and Alexander Pichugin and Aleksandr Beznosikov},
      year={2024},
      eprint={2412.11689},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2412.11689},
}
```