This is a seq2seq Question Generation model. We referred to this repository for the basic data interface and evaluation. We modified the model to integrate external information from a Knowledge Graph to assist decoding, and obtained better test results.
To train or test our model, you need the following Python packages:

- python >= 3.7
- pytorch >= 1.5
- nltk (the nltk_data files are also required)
- tqdm
- pytorch_scatter
The Knowledge Graph data has already been preprocessed by us; the original KG data is included in ./data/resource.json.
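To inspect the raw KG resource yourself, you can load it directly. This is a minimal sketch; the schema shown in the comment (a list of triples) is an assumption, not the file's documented format, so adjust the keys to what you actually find in resource.json.

```python
import json

def load_kg(path="./data/resource.json"):
    """Load the raw Knowledge Graph resource file.

    The structure of the returned object (e.g. a dict containing
    triples) is an assumption here -- check the real file's layout.
    """
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```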
Due to the corpus size, we cannot provide the SQuAD data on GitHub, but you can download and preprocess the corpus as follows:

```shell
mkdir squad
wget http://nlp.stanford.edu/data/glove.840B.300d.zip -O ./data/glove.840B.300d.zip
unzip ./data/glove.840B.300d.zip
wget https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json -O ./squad/train-v1.1.json
wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json -O ./squad/dev-v1.1.json
cd data
python process_data.py
```
You may need to adjust the model configuration in ./config.py. If you want to train on a GPU, set the GPU device in config.py. Other model configurations and hyper-parameters can also be customized there.
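As an illustration of the kinds of settings config.py exposes, here is a minimal sketch. The field names below (use_gpu, gpu_device, batch_size, etc.) are assumptions for illustration, not the repository's actual option names.

```python
class Config:
    """Hypothetical configuration object; field names are illustrative,
    not the repository's actual config.py attributes."""

    def __init__(self, use_gpu=False, gpu_device=0,
                 batch_size=32, learning_rate=1e-3, num_epochs=20):
        self.batch_size = batch_size
        self.learning_rate = learning_rate
        self.num_epochs = num_epochs
        # Select a CUDA device string when a GPU is requested,
        # otherwise fall back to the CPU.
        self.device = f"cuda:{gpu_device}" if use_gpu else "cpu"
```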
To train the model, use the following command:

```shell
python main.py --train (--model_path=<your_model_savepoint_path>)
```

The --model_path parameter is optional; to train from scratch, simply run `python main.py --train`.
Whenever your model achieves the best development-set result of the current training run, its parameters are saved to ./save/train_<timestamp>/<epoch_number>_<dev_loss>.
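The checkpoint naming scheme above can be sketched as follows. The timestamp format and the number of decimal places used for <dev_loss> are assumptions; the real code may format them differently.

```python
import time

def checkpoint_path(epoch, dev_loss, timestamp=None):
    """Build a save path of the form
    ./save/train_<timestamp>/<epoch_number>_<dev_loss>.

    The timestamp layout and the 4-decimal loss formatting are
    illustrative assumptions, not the repository's exact scheme.
    """
    timestamp = timestamp or time.strftime("%Y%m%d_%H%M%S")
    return f"./save/train_{timestamp}/{epoch}_{dev_loss:.4f}"
```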
To test the model, use the following command:

```shell
python main.py --model_path=<your_model_paras_path> --output_file=<output_file_path>
```
The evaluation scripts are taken from this repository (note that eval.py requires Python 2):

```shell
cd qgevalcap
python2 eval.py --out_file <prediction_file> --src_file <src_file> --tgt_file <target_file>
```
| BLEU_1 | BLEU_2 | BLEU_3 | BLEU_4 |
| --- | --- | --- | --- |
| 46.30 | 30.85 | 22.76 | 17.47 |
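For intuition about what the scores above measure, here is a simplified sentence-level BLEU-n in plain Python. It is an illustration only: the official qgevalcap/eval.py script should be used for reported numbers, since it handles multiple references, smoothing, and corpus-level aggregation.

```python
import math
from collections import Counter

def bleu_n(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU with uniform n-gram weights.

    candidate, reference: lists of tokens. Returns 0.0 if any
    n-gram precision is zero (no smoothing is applied).
    """
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum((cand & ref).values())  # clipped n-gram matches
        total = max(sum(cand.values()), 1)
        precisions.append(overlap / total)

    if min(precisions) == 0:
        return 0.0
    # Brevity penalty for candidates shorter than the reference.
    bp = math.exp(min(0.0, 1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```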
- Improving Neural Story Generation by Targeted Common Sense Grounding
- Commonsense Knowledge Aware Conversation Generation with Graph Attention
- Knowledge Aware Conversation Generation with Explainable Reasoning over Augmented Graphs
- Answer-focused and Position-aware Neural Question Generation
- Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks
- Identifying Where to Focus in Reading Comprehension for Neural Question Generation