Domain-agnostic Question-Answering with Adversarial Training

Code for our paper "Domain-agnostic Question-Answering with Adversarial Training", accepted at the EMNLP-IJCNLP 2019 MRQA Workshop.

Model Architecture

Data Preparation

Download the original data

Download the data by running the shell script below.
Then run the code. The preprocessed training data is created before training starts (this takes quite a long time).

$ cd data
$ ./download_data.sh

(Optional) Download the pickled data (for fast data loading)

Download the pickled data from this link.
Unzip the zip file in the repository root directory.

.
├── ...
├── pickled_data_bert-base-uncased_False
│   ├── HotpotQA.pkl
│   ├── NaturalQuestions.pkl
│   ├── NewsQA.pkl
│   ├── SQuAD.pkl
│   ├── SearchQA.pkl
│   └── TriviaQA.pkl
└── ...
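
As a quick sanity check after unzipping, the layout above can be verified with a short script. This is only a sketch and not part of the repository: the helper name `missing_pickles` is hypothetical, and it assumes the script is run from the repository root.

```python
from pathlib import Path

# Dataset names copied from the directory listing above.
EXPECTED_DATASETS = [
    "HotpotQA", "NaturalQuestions", "NewsQA",
    "SQuAD", "SearchQA", "TriviaQA",
]


def missing_pickles(root, pickled_dir="pickled_data_bert-base-uncased_False"):
    """Return the expected .pkl file names that are absent under root/pickled_dir.

    Hypothetical helper for illustration; it only checks that files exist,
    it does not load or validate their contents.
    """
    base = Path(root) / pickled_dir
    return [
        f"{name}.pkl"
        for name in EXPECTED_DATASETS
        if not (base / f"{name}.pkl").is_file()
    ]


if __name__ == "__main__":
    missing = missing_pickles(".")
    if missing:
        print(f"missing: {missing}")
    else:
        print("all pickled files found")
```

An empty result means all six files are where the tree above says they should be.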

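If you use the pickled data, keep the following two arguments at the defaults shown below. If you change either of them, the pickled files presumably no longer match, and the data has to be preprocessed again from the original files.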

parser.add_argument("--bert_model", default="bert-base-uncased", type=str, help="Bert model")
parser.add_argument("--skip_no_ans", action='store_true', default=False, help="whether to exclude no answer example")
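
The sketch below shows how these two defaults line up with the pickled directory name. The two `add_argument` calls are the ones quoted above; the function `pickled_dir_name` and the naming scheme it uses are assumptions inferred from the directory name `pickled_data_bert-base-uncased_False`, not something confirmed by the repository code.

```python
import argparse


def build_parser():
    """Parser containing only the two arguments quoted above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--bert_model", default="bert-base-uncased", type=str,
                        help="Bert model")
    parser.add_argument("--skip_no_ans", action="store_true", default=False,
                        help="whether to exclude no answer example")
    return parser


def pickled_dir_name(args):
    # ASSUMPTION: naming scheme inferred from the directory shown in the
    # listing; the repository may build this name differently.
    return f"pickled_data_{args.bert_model}_{args.skip_no_ans}"


# Parsing an empty argument list yields the defaults.
args = build_parser().parse_args([])
print(pickled_dir_name(args))  # pickled_data_bert-base-uncased_False
```

Under this assumed scheme, passing `--skip_no_ans` or a different `--bert_model` yields a different directory name, which is why the downloaded pickles would no longer be picked up.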

Requirements

First install the libraries listed in requirements.txt.

torch==1.1.0
pytorch-pretrained-bert>=0.6.2
json-lines>=0.5.0

Model Training & Validation

$ python3 main.py \
         --epochs 2 \
         --batch_size 64 \
         --lr 3e-5 \
         --do_lower_case \
         --use_cuda \
         --do_valid \
         --adv \
         --dis_lambda 0.01

If you use an uncased BERT model, pass the --do_lower_case option.
To run validation, pass the --do_valid option.
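
The two rules above can be captured in a small launcher script. This is a hedged sketch: `build_command` is a hypothetical helper, the hyperparameter values are copied from the example command, and it assumes main.py accepts `--bert_model` as defined in the argument snippet in the data preparation section.

```python
def build_command(bert_model="bert-base-uncased", do_valid=True):
    """Assemble the training command as a list of arguments.

    Hypothetical helper: adds --do_lower_case only for uncased models and
    --do_valid only when validation is requested.
    """
    cmd = [
        "python3", "main.py",
        "--bert_model", bert_model,  # assumed to be accepted by main.py
        "--epochs", "2",
        "--batch_size", "64",
        "--lr", "3e-5",
        "--use_cuda",
        "--adv",
        "--dis_lambda", "0.01",
    ]
    if "uncased" in bert_model:
        cmd.append("--do_lower_case")
    if do_valid:
        cmd.append("--do_valid")
    return cmd


# Print the command for inspection; pass the list to subprocess.run to launch it.
print(" ".join(build_command()))
```

Building the command as a list keeps the flag logic in one place and avoids shell quoting issues if it is later passed to `subprocess.run`.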

Reference

@inproceedings{lee-etal-2019-domain,
    title={Domain-agnostic Question-Answering with Adversarial Training},
    author={Seanie Lee and Donggyu Kim and Jangwon Park},
    booktitle={Proceedings of the 2nd Workshop on Machine Reading for Question Answering},
    publisher={Association for Computational Linguistics},
    year={2019},
    url={https://www.aclweb.org/anthology/D19-5826},
}

Contributors

Lee, Seanie (https://seanie12.github.io/)
Kim, Donggyu (https://github.com/donggyukimc)
Park, Jangwon (https://github.com/monologg)
