This is an implementation of the paper "On Concept-Based Explanations in Deep Neural Networks" (https://arxiv.org/abs/1910.07969). It applies the ConceptSHAP technique to BERT and other transformer-based language models via the Huggingface Transformers library, and was developed by members of Machine Learning @ Berkeley for Intuit's Machine Learning Futures Group in Spring 2020.
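At its core, ConceptSHAP scores each discovered concept by its Shapley contribution to a completeness metric. A minimal sketch of that attribution, using a toy stand-in for the completeness function (the real metric measures how well the concepts recover the model's predictions):

```python
from itertools import combinations
from math import factorial

def concept_shap(concepts, completeness):
    """Exact Shapley value of each concept's contribution to completeness.

    `completeness` maps a frozenset of concepts to a score; here it is a
    hypothetical stand-in for the completeness metric defined in the paper.
    """
    n = len(concepts)
    shap = {}
    for c in concepts:
        others = [x for x in concepts if x != c]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                S = frozenset(subset)
                # Standard Shapley weight for a coalition of size k.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                # Marginal gain in completeness from adding concept c to S.
                total += weight * (completeness(S | {c}) - completeness(S))
        shap[c] = total
    return shap

# Toy completeness: proportional to how many concepts are used.
demo = concept_shap(["c1", "c2"], lambda S: len(S) / 2)
```

By construction the scores sum to the completeness of the full concept set, so in this toy case each of the two symmetric concepts receives 0.5.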
```
git clone https://github.com/arnav-gudibande/intuit-project.git
pip3 install -r requirements.txt
```
- `data/`
  - `imdb-dataloader.py` -- dataloader for the IMDB Movie Sentiment Dataset; contains options to format test/train data
  - `20news-dataloader.py` -- dataloader for the 20NewsGroups dataset
- `model/`
  - `bert-imdb.py` and `bert-20news.py` -- training scripts for the Huggingface BERT language model
  - `bert_inference.py` -- outputs embeddings generated from a trained transformer model for a target dataset
- `clustering/`
  - `generateClusters.py` -- k-means clustering of output embeddings. Note: this was discarded from the initial ConceptSHAP paper, but can still be used to test classical unsupervised methods against ConceptSHAP
- `conceptSHAP/`
  - `conceptNet.py` -- trainable subclass that learns concepts
  - `train_eval.py` -- training script for `conceptNet.py`
  - `interpretConcepts.py` -- post-training concept analysis and tensorboard plotting
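As a rough illustration of what `generateClusters.py` does (this sketch is not the script itself, and the toy vectors stand in for the saved BERT embeddings), k-means assigns each embedding to its nearest cluster center:

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Minimal k-means; an illustrative stand-in for clustering embeddings."""
    # Deterministic init: pick k points spread evenly across the dataset.
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)]
    for _ in range(iters):
        # Assign each embedding to its nearest center by squared distance.
        labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        # Recompute each center as the mean of its assigned points.
        centers = np.stack([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

# Two well-separated toy "embedding" clusters.
X = np.vstack([np.zeros((5, 4)), np.full((5, 4), 10.0)])
labels, centers = kmeans(X, k=2)
```

With real embeddings, each resulting cluster can then be inspected as a candidate "classical" concept and compared against the concepts ConceptSHAP learns.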
- Download and format the IMDB dataset: `sh data/imdb-dataloader.sh`
- Train BERT model on IMDB: `sh model/bert-imdb.sh`
- Generate and save BERT embeddings: `sh model/bert-inference_imdb.sh`
- Run ConceptSHAP: `sh conceptSHAP/train_eval_imdb.sh`
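The ConceptSHAP step above trains concept vectors against the saved embeddings. A hypothetical sketch of the projection at the heart of that step (the names and the fixed threshold here are illustrative, not the real `conceptNet.py` API, which learns the concept vectors jointly with a completeness loss):

```python
import numpy as np

def concept_scores(embeddings, concept_vectors, threshold=0.5):
    """Similarity of each embedding to each concept vector.

    Hypothetical illustration of the concept-activation step: scores are
    dot products against unit-normalized concept vectors, with weak
    activations zeroed out by a threshold.
    """
    # Unit-normalize concepts so scores behave like cosine similarities.
    c = concept_vectors / np.linalg.norm(concept_vectors, axis=1, keepdims=True)
    scores = embeddings @ c.T
    return np.where(scores > threshold, scores, 0.0)

# One concept aligned with the first embedding, orthogonal to the second.
emb = np.array([[1.0, 0.0], [0.0, 1.0]])
concepts = np.array([[2.0, 0.0]])
scores = concept_scores(emb, concepts)
```

The first embedding activates the concept strongly while the orthogonal one is thresholded to zero; stacking such scores over a dataset is what makes concepts comparable and attributable.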
- Download and format 20News: `sh data/20news-dataloader.sh`
- Train BERT model on 20News: `python3 model/bert-20news.py`
- Generate and save BERT embeddings: `sh model/bert-inference_20news.sh`
- Run ConceptSHAP: `sh conceptSHAP/train_eval_20news.sh`
```
tensorboard --logdir=runs --port=6006
```