This repository contains the implementation of D-ADD, a detector that defends against model stealing attacks.
Below is an example using WRN16_4 and CIFAR10.
To train a model:
python train_model.py --model wrn16_4 --dataset cifar10 --epoch 50 --batch_size 256 --lr 0.1 --save_path pretrain/wrn16_4_cifar10.pth --data_path ../datasets --device cuda
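As a rough illustration of the training step (this is an assumed sketch, not the repository's code: a linear softmax classifier on synthetic data stands in for WRN16-4 on CIFAR-10, with the command's epochs=50, batch_size=256, lr=0.1 wired in):

```python
import numpy as np

# Hypothetical stand-in for train_model.py: minibatch SGD with cross-entropy.
rng = np.random.default_rng(0)
num_classes, dim, n = 10, 32, 1024
X = rng.normal(size=(n, dim))
W_true = rng.normal(size=(dim, num_classes))
y = (X @ W_true).argmax(axis=1)          # synthetic labels in place of CIFAR-10

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

W = np.zeros((dim, num_classes))
epochs, batch_size, lr = 50, 256, 0.1    # mirrors --epoch, --batch_size, --lr
for _ in range(epochs):
    idx = rng.permutation(n)
    for s in range(0, n, batch_size):
        b = idx[s:s + batch_size]
        probs = softmax(X[b] @ W)
        probs[np.arange(len(b)), y[b]] -= 1.0   # gradient of CE w.r.t. logits
        W -= lr * X[b].T @ probs / len(b)

train_acc = float(((X @ W).argmax(1) == y).mean())
print(train_acc)
```

In the real script the trained weights would then be serialized to the `--save_path` checkpoint.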
Calculate the mean and covariance of the training-set sample features:
python get_mean_cov.py --model wrn16_4 --dataset cifar10 --num_classes 10 --data_path ../datasets --model_path pretrain --device cuda
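Conceptually, this step presumably computes per-class feature statistics, along these lines (an assumed sketch: feature extraction is omitted, and `features`/`labels` stand in for the victim model's outputs on the training set):

```python
import numpy as np

# Hypothetical illustration of what get_mean_cov.py computes: for each class,
# the mean and covariance of the training samples' feature vectors.
rng = np.random.default_rng(0)
num_classes, feat_dim = 10, 64
features = rng.normal(size=(1000, feat_dim))   # placeholder feature matrix
labels = rng.integers(0, num_classes, size=1000)

means, covs = {}, {}
for c in range(num_classes):
    fc = features[labels == c]
    means[c] = fc.mean(axis=0)
    # rowvar=False: rows are samples, columns are feature dimensions
    covs[c] = np.cov(fc, rowvar=False)

print(means[0].shape, covs[0].shape)   # (64,) (64, 64)
```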
Calculate the distance between the query dataset and the target dataset, and test benign accuracy:
python get_distance.py --model wrn16_4 --dataset cifar100 --train_set False --target_dataset cifar10 --num_classes 10 --defense True --threshold 11.6 --window_size 16 --data_path ../datasets --device cuda
python get_distance.py --model wrn16_4 --dataset stl10 --data_filter '[1, 3, 7]' --target_dataset cifar10 --num_classes 10 --defense True --threshold 11.6 --window_size 16 --data_path ../datasets --device cuda
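The detection idea behind the `--threshold` and `--window_size` flags can be sketched as follows (an assumption about the mechanism, not the repository's exact code): score each query by its Mahalanobis distance to the class statistics, average the scores over a sliding window of recent queries, and flag the client once the windowed average exceeds the threshold.

```python
import numpy as np

def mahalanobis(x, mean, cov_inv):
    # Distance of feature vector x to a class mean under the class covariance.
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

def windowed_defense(distances, window_size=16, threshold=11.6):
    # Flag each query whose trailing-window average distance exceeds threshold.
    flags = []
    for i in range(len(distances)):
        window = distances[max(0, i - window_size + 1): i + 1]
        flags.append(sum(window) / len(window) > threshold)
    return flags

# Example score for a single query (identity covariance for simplicity):
d = mahalanobis(np.ones(3), np.zeros(3), np.eye(3))   # sqrt(3)

# In-distribution queries stay below the threshold; out-of-distribution
# queries (e.g. CIFAR-100 against a CIFAR-10 victim) push the window above it.
print(windowed_defense([5.0] * 16))    # all False
print(windowed_defense([20.0] * 16))   # all True
```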
Run the Knockoff attack against the defended victim:
python knockoff.py --victim_model wrn16_4 --surrogate_model wrn16_4 --target_dataset cifar10 --surrogate_dataset cifar100 --train_set True --num_classes 10 --window_size 16 --threshold 11.6 --lr 0.1 --e 30 --data_path ../datasets --device cuda
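The Knockoff-style extraction loop can be sketched as below (a toy, assumed structure: both "models" are linear softmax classifiers so the example is self-contained; the attacker queries the victim on a surrogate dataset and trains the surrogate on the returned soft labels).

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, dim = 10, 32

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

W_victim = rng.normal(size=(dim, num_classes))   # stands in for the victim model
X_query = rng.normal(size=(256, dim))            # surrogate-dataset queries
soft_labels = softmax(X_query @ W_victim)        # victim's (soft) responses

# Train the surrogate on the victim's soft labels with cross-entropy SGD.
W_surr = np.zeros((dim, num_classes))
lr = 0.1
for _ in range(30):                              # mirrors --e 30
    probs = softmax(X_query @ W_surr)
    grad = X_query.T @ (probs - soft_labels) / len(X_query)
    W_surr -= lr * grad

# Fraction of queries on which the surrogate agrees with the victim's top label.
agree = float((softmax(X_query @ W_surr).argmax(1) == soft_labels.argmax(1)).mean())
print(agree)
```

In the real script, D-ADD sits between the attacker and the victim: queries flagged by the windowed distance check would not receive useful responses.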