

CSAW-HackML-2020

Group Member Names and IDs: Youwen Zhang (yz6999), Run Yu (ry2068)

Submitted files:

models: /RepairedNet/ (contains the repaired models)

code: /project.ipynb

code pdf: /project.pdf

project report: /project_report.pdf

Project Summary

We assume the bad net is not just a simple backdoor attack but an adaptive one, i.e., the attack embeds the backdoor functionality in the same neurons that are activated by clean inputs. We therefore use a fine-pruning defense strategy: we first prune the neurons that stay inactive on the clean validation data, and then retrain the remaining network on that validation data to recover the prediction accuracy lost because some useful neurons were also pruned.

Project details

  1. Dependencies. We run our code in Google Colab; a GPU is not required but is recommended, and all data and library downloads can be done automatically from the Colab website. A minimal setup sketch is shown below.
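A minimal Colab setup sketch; the assumption here is that only the pruning toolkit used in step 4 needs installing on top of Colab's preinstalled TensorFlow (the notebook's actual setup cell may differ):

```python
# Colab setup sketch (assumed): TensorFlow ships with Colab, but the
# pruning toolkit used in step 4 usually has to be installed explicitly.
!pip install -q tensorflow-model-optimization
```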

  2. Download the required models and data. We get the data and bad models from the Google Drive folder and Git repository provided by CSAW-HackML-2020; the required bad models are sunglasses_bd_net.h5, anonymous_1_bd_net.h5, and multi_trigger_multi_target_bd_net.h5. We download and load the clean validation data for fine-pruning, and the poisoned data to later test our repaired good models. A download sketch is shown below.
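For reference, a sketch of the download step in a Colab cell. The repository URL is an assumption based on the competition name, and the Google Drive file IDs for the .h5 data and models are deliberately not reproduced here; take them from the notebook:

```python
# Clone the competition starter repo (URL assumed, not verified here).
!git clone https://github.com/csaw-hackml/CSAW-HackML-2020.git
# The .h5 validation/poisoned data and bad models live on Google Drive and
# can be fetched with gdown; the file IDs are omitted (see the notebook).
```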

  3. Evaluate the bad net on poisoned and clean data. We first measure the prediction accuracy on clean test data and the attack success rate (called "asr" in the rest of this document) on poisoned data; a minimal evaluation sketch follows. From the output graph below, we can see that even though the bad model achieves good prediction accuracy on clean data, the backdoor makes the asr very high as well. Attackers can easily attack these backdoored models using poisoned images. Next, we will use our method to eliminate the bad guys' backdoor in these models!
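A minimal evaluation sketch, assuming the competition's HDF5 layout ("data" and "label" keys, channel-first storage, pixel values in 0-255) and illustrative file names:

```python
import h5py
import numpy as np
import tensorflow as tf

def data_loader(filepath):
    # Assumed layout: 'data' is (N, C, H, W) in 0-255, 'label' is (N,).
    with h5py.File(filepath, 'r') as f:
        x = np.array(f['data']).transpose((0, 2, 3, 1)) / 255.0
        y = np.array(f['label'])
    return x, y

bd_model = tf.keras.models.load_model('sunglasses_bd_net.h5')

x_clean, y_clean = data_loader('clean_test_data.h5')             # assumed name
x_poison, y_poison = data_loader('sunglasses_poisoned_data.h5')  # assumed name

clean_acc = np.mean(np.argmax(bd_model.predict(x_clean), axis=1) == y_clean)
asr = np.mean(np.argmax(bd_model.predict(x_poison), axis=1) == y_poison)
print(f'clean accuracy: {clean_acc:.4f}, attack success rate: {asr:.4f}')
```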

[Figure: bad-net evaluation output: clean-data accuracy and attack success rate]

  4. Fine-prune and evaluate the repaired model. We use the tensorflow_model_optimization.sparsity library to do the pruning and the training after pruning (documentation: https://www.tensorflow.org/model_optimization/api_docs/python/tfmot/sparsity/keras). The prune_low_magnitude method in this library prunes the lowest-magnitude (least activated) neurons in the model first, and we then train the model for 3 epochs to get a good model; a sketch follows the figures below. After pruning, we re-evaluate the repaired model.

[Figures: repaired-model evaluation output: clean-data accuracy and attack success rate after fine-pruning]

From these graphs and outputs, we can see that even though we lose some accuracy on clean data, we decrease the asr tremendously!
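A minimal fine-pruning sketch built on the prune_low_magnitude API named above; the target sparsity, optimizer, loss, and the variable names x_valid/y_valid (the clean validation set) are illustrative assumptions:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

bd_model = tf.keras.models.load_model('sunglasses_bd_net.h5')

# Wrap the bad net so its lowest-magnitude weights get pruned during
# training; the 50% target sparsity is an assumed example value.
pruned = tfmot.sparsity.keras.prune_low_magnitude(
    bd_model,
    pruning_schedule=tfmot.sparsity.keras.ConstantSparsity(
        target_sparsity=0.5, begin_step=0))

pruned.compile(optimizer='adam',
               loss='sparse_categorical_crossentropy',
               metrics=['accuracy'])

# Fine-tune for 3 epochs on the clean validation data (x_valid, y_valid
# assumed already loaded); UpdatePruningStep applies the pruning masks.
pruned.fit(x_valid, y_valid, epochs=3,
           callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

# Drop the pruning wrappers before evaluating/saving the repaired model.
repaired_model = tfmot.sparsity.keras.strip_pruning(pruned)
```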

  5. Save the repaired model. We use model.save and model.save_weights to save the repaired model and its weights, and we upload the repaired models and weights to the Git repository. What's more, even though we successfully decrease the asr, we do not directly meet the requirement to "output class N+1 if the input is backdoored". We had some trouble exporting such a model, but if required, users can first get the pruned model and then combine the pruned model with the bad model to obtain a model that outputs class N+1 when the input is backdoored; we add an example at the end of the Colab file, and a sketch of the idea follows.
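A sketch of that combination, following the standard fine-pruning "GoodNet" idea (this is our reading of the step, not necessarily the exact code in the Colab file): run both models and flag any disagreement as a backdoored input.

```python
import numpy as np

def goodnet_predict(bd_model, repaired_model, x):
    """Combined prediction: returns the shared label when the bad net and
    the repaired net agree, and N+1 (index N with 0-indexed labels 0..N-1)
    when they disagree, i.e. when the input looks backdoored."""
    n_classes = bd_model.output_shape[-1]
    bd_labels = np.argmax(bd_model.predict(x), axis=1)
    rp_labels = np.argmax(repaired_model.predict(x), axis=1)
    return np.where(bd_labels == rp_labels, bd_labels, n_classes)
```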

About

The starting point for the CSAW HackML 2020 competition is here: https://www.csaw.io/hackml
