This example builds an attentional seq2seq model for machine translation.
Two example datasets are provided:
- toy_copy: A small toy autoencoding dataset from the TF Seq2seq toolkit.
- iwslt14: The benchmark IWSLT2014 (de-en) machine translation dataset, following (Ranzato et al., 2015) for data pre-processing.
Download the data with the following commands:
python prepare_data.py --data toy_copy
python prepare_data.py --data iwslt14
Train the model with the following command:
python seq2seq_attn.py --config_model config_model --config_data config_toy_copy
Here:
- --config_model specifies the model config. Note: do not include the .py suffix.
- --config_data specifies the data config.
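Because the flags take Python module names rather than file paths, the training script can load the configs as importable modules, which is why the .py suffix must be omitted. Below is a minimal sketch of this loading pattern (illustrative only; the exact code in seq2seq_attn.py may differ):

```python
# Sketch: loading configs by module name (illustrative; seq2seq_attn.py may differ).
# The flag value is treated as an importable module name, not a file path,
# which is why the .py suffix must be omitted.
import argparse
import importlib

parser = argparse.ArgumentParser()
parser.add_argument("--config_model", default="config_model",
                    help="Model config module name (no .py suffix).")
parser.add_argument("--config_data", default="config_toy_copy",
                    help="Data config module name (no .py suffix).")
args = parser.parse_args()

config_model = importlib.import_module(args.config_model)
config_data = importlib.import_module(args.config_data)

# Hyperparameters are then read as module attributes, e.g.:
# encoder_hparams = config_model.encoder
```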
config_model.py specifies a single-layer seq2seq model with Luong attention and bi-directional RNN encoder. Hyperparameters taking default values can be omitted from the config file.
For demonstration purposes, config_model_full.py gives all possible hyperparameters for the model. The two config files lead to the same model.
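As a rough illustration, a config of this kind is a plain Python module of hyperparameter dictionaries. The sketch below shows the general shape only; the key names and values here are assumptions, and config_model.py / config_model_full.py remain the authoritative settings:

```python
# Illustrative sketch of a model config module (values and key names are
# assumptions; see config_model.py for the actual settings).
num_units = 256

embedder = {
    'dim': num_units,
}

# Bi-directional RNN encoder: the forward-cell settings shown here would
# typically be mirrored for the backward cell by default.
encoder = {
    'rnn_cell_fw': {
        'kwargs': {'num_units': num_units},
    },
}

# Single-layer RNN decoder with (Luong) attention.
decoder = {
    'rnn_cell': {
        'kwargs': {'num_units': num_units},
    },
    'attention': {
        'kwargs': {'num_units': num_units},
        'attention_layer_size': num_units,
    },
}
```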
On the IWSLT14 dataset, using the original target texts as references (i.e., no <UNK> in the references), the model achieves BLEU = 26.44 ± 0.18.
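If you want to re-check a score like this on your own outputs, corpus-level BLEU can be computed with standard tools. A small sketch using NLTK follows (this is not necessarily the scorer used to produce the number above, and the file names are hypothetical):

```python
# Sketch: corpus BLEU between model outputs and the original target texts,
# using NLTK's scorer. Not necessarily the exact scorer behind the reported
# 26.44; the file names below are hypothetical placeholders.
from nltk.translate.bleu_score import corpus_bleu

with open("outputs.txt") as f_hyp, open("targets.txt") as f_ref:
    hypotheses = [line.split() for line in f_hyp]
    references = [[line.split()] for line in f_ref]  # one reference per sentence

print("BLEU = {:.2f}".format(100 * corpus_bleu(references, hypotheses)))
```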