
Failed to run GNMT #5

Open
skyw opened this issue Jul 13, 2017 · 14 comments


skyw commented Jul 13, 2017

It complains with a KeyError:
"KeyError: num_residual_layers"

Here is my script

python -m nmt.nmt \
    --src=en --tgt=de \
    --vocab_prefix=${DATA_DIR}/vocab \
    --train_prefix=${DATA_DIR}/train \
    --dev_prefix=${DATA_DIR}/newstest2014 \
    --test_prefix=${DATA_DIR}/newstest2015 \
    --out_dir=${OUT_DIR}/test \
    --hparams_path=nmt/standard_hparams/wmt16_en_de_gnmt.json
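A quick way to see which keys that hparams file actually defines (a hypothetical debugging snippet, not from the thread; it assumes the repo-relative path used in the command above):

import json

# If a key the code looks up, such as num_residual_layers, is absent from
# the file, loading it fails with exactly the KeyError reported above.
with open('nmt/standard_hparams/wmt16_en_de_gnmt.json') as f:
    print(sorted(json.load(f).keys()))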

oahziur self-assigned this Jul 13, 2017

oahziur commented Jul 13, 2017

Thanks, I will need to update the nmt/standard_hparams/wmt16_en_de_gnmt.json.


oahziur commented Jul 13, 2017

I am also adding instructions on how to train and load the gnmt model from scratch.

vince62s commented

I ran a standard attention / scaled_luong / uni system and got the expected results. With the gnmt architecture / scaled_luong / enc_type gnmt, the results are completely off.
Is there something special to do for the GNMT attention architecture?


oahziur commented Jul 21, 2017

@vince62s Did you check with the standard_hparams for GNMT? There are also pre-trained models available for download on the README page.


ndvbd commented Jan 2, 2018

Same problem here. After training, when doing inference I get:

KeyError: 'num_encoder_residual_layers'

It only works when I delete all of these keys from the hparams file and set --hparams_path to the best_bleu directory, but then after one run, for some reason, it rewrites the hparams file and adds these problematic key/values back... It's not clear how this mechanism works.

My guess is that when the code saves hparams, it writes key/values that it isn't supposed to.
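A minimal sketch of that guess (a hypothetical simplification assuming tf.contrib.training.HParams; this is not the actual repo code):

from tensorflow.contrib.training import HParams

def merge_with_defaults(loaded, defaults):
    # Copy every default key the loaded hparams is missing. If the merged
    # result is then written back to out_dir/hparams on each run, keys you
    # deleted by hand (e.g. num_encoder_residual_layers) reappear.
    loaded_keys = loaded.values()
    for key, value in defaults.values().items():
        if key not in loaded_keys:
            loaded.add_hparam(key, value)
    return loaded

defaults = HParams(num_encoder_residual_layers=1, num_units=128)
loaded = HParams(num_units=512)  # older hparams file, missing newer keys
print(merge_with_defaults(loaded, defaults).values())

If saving happens after a merge like this, hand-deleting keys from the file can never stick, which would explain the behavior above.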


oahziur commented Jan 2, 2018

@NadavB Can you share the command that produces the error? Were you using the standard_hparams file from the repo for inference?

There have been some updates to the hparams recently, so I think the standard_hparams may be out of date.


ndvbd commented Jan 4, 2018

@oahziur I did not use the standard hparams. I used the params as shown in the tutorial.

So for training:

python nmt/nmt.py \
    --attention=scaled_luong \
    --src=vi --tgt=en \
    --vocab_prefix=tmp/nmt_data/vocab  \
    --train_prefix=tmp/nmt_data/train \
    --dev_prefix=tmp/nmt_data/tst2012  \
    --test_prefix=tmp/nmt_data/tst2013 \
    --out_dir=/tmp/nmt_attention_model \
    --num_train_steps=5000 \
    --steps_per_stats=20 \
    --num_layers=2 \
    --num_units=128 \
    --dropout=0.2 \
    --metrics=bleu

And for inference:

python nmt/nmt.py \
    --out_dir=/tmp/nmt_attention_model \
    --inference_input_file=/tmp/nmt_data/source_infer.vi \
    --inference_output_file=/tmp/nmt_attention_model/output_infer

LimWoohyun commented

@NadavB Hello, I'm studying NMT. I want to run a test file, so I just ran nmt.py, but it failed.

How do I run your script? Please let me know the basics.


ndvbd commented Jan 10, 2018

@LimWoohyun Look at https://github.com/tensorflow/nmt and search for "Hands-on – building an attention-based NMT model"; the command is written there.


bquast commented Feb 22, 2018

@oahziur I get a key error using the standard_hparams (tf 1.6rc1; will try on my other machine with tf 1.5-cuda).

NotFoundError (see above for traceback): Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

using the command:

[bquast@UX370UA ~]$ cd nmt
[bquast@UX370UA nmt]$ python -m nmt.nmt \
>     --src=de --tgt=en \
>     --ckpt=deen_gnmt_model_4_layer/translate.ckpt \
>     --hparams_path=nmt/standard_hparams/wmt16_gnmt_4_layer.json \
>     --out_dir=/tmp/deen_gnmt \
>     --vocab_prefix=/home/bquast/en_de_data/vocab.bpe.32000 \
>     --inference_input_file=/home/bquast/en_de_data/newstest2014.tok.bpe.32000.de \
>     --inference_output_file=/home/bquast/deen_gnmt_model_4_layer/output_infer \

full output here:

https://gist.github.com/bquast/30ba7630d2bf32b59dd8349889fc7638

EDIT: confirmed, same error on tf 1.5-cuda

https://gist.github.com/bquast/0ddbf8eda363d312dd57b51aebb11f5d


tiberiu92 commented Mar 8, 2018

@bquast I recently got the error using the same configuration.

Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

I tried this with tf 1.4 too, but no luck. Are there any updates on this?

Thank you.


bquast commented Mar 9, 2018

Hey, no news yet. Any progress on your side?


oahziur commented Mar 10, 2018

@bquast I think this is related to #264 and there is a PR that fixes it: #265. Maybe you can try patching in the PR and see if you still get the issue. Make sure you clear the model directory.


xiaohaoliang commented Jul 3, 2018

@bquast @tiberiu92 @oahziur
I got the same error using the same configuration. (tf-1.8, python-2.7)

python -m nmt.nmt \
    --src=de --tgt=en \
    --ckpt=/home/xiaohao/nmt/models/deen_gnmt_model_4_layer/translate.ckpt \
    --hparams_path=nmt/standard_hparams/wmt16_gnmt_4_layer.json \
    --out_dir=/home/xiaohao/data/deen_gnmt \
    --vocab_prefix=/home/xiaohao/data/wmt16/vocab.bpe.32000 \
    --inference_input_file=/home/xiaohao/data/wmt16/newstest2015.tok.bpe.32000.de \
    --inference_output_file=/home/xiaohao/data/deen_gnmt/output_infer \
    --inference_ref_file=/home/xiaohao/data/wmt16/newstest2015.tok.bpe.32000.en

NotFoundError (see above for traceback): Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint

I printed the keys of deen_gnmt_model_4_layer/translate.ckpt and could not find .../rnn/basic_lstm_cell/bias.
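(ckpt_print.py isn't shown in the thread; a minimal sketch that would produce output like the below, assuming Python 2 and tf.train.NewCheckpointReader:)

# ckpt_print.py -- hypothetical reconstruction, not the actual script.
import sys
import tensorflow as tf

ckpt = sys.argv[1]
print('CHECKPOINT_FILE: ', ckpt)

# NewCheckpointReader lists every variable stored in a checkpoint without
# rebuilding the graph, which is handy for debugging restore errors.
reader = tf.train.NewCheckpointReader(ckpt)
for name in reader.get_variable_to_shape_map():
    print('tensor_name: ', name)

Under Python 2 those print calls emit tuples, which matches the output: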

xiaohao@ubuntu:~/nmt$ python ckpt_print.py models/deen_gnmt_model_4_layer/translate.ckpt
('CHECKPOINT_FILE: ', 'models/deen_gnmt_model_4_layer/translate.ckpt')
('tensor_name: ', 'embeddings/encoder/embedding_encoder')
('tensor_name: ', 'dynamic_seq2seq/decoder/memory_layer/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_3/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_3/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/output_projection/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/query_layer/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_v')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_b')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_g')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/bias')
('tensor_name: ', 'Variable')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/basic_lstm_cell/bias')
('tensor_name: ', 'embeddings/decoder/embedding_decoder')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_2/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_2/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_2/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_2/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/kernel')
xiaohao@ubuntu:~/nmt$

I tried the PR (#265) and ran rm -rf /home/xiaohao/data/deen_gnmt/*. The problem is solved!

Thanks~ @oahziur
