
The value of BLEU is always 0 when training the expert model #14

Open
Zachary-YL opened this issue May 26, 2019 · 5 comments

@Zachary-YL

Hello, when I use a Chinese (word-segmented) and English (tokenized) parallel corpus to train the expert model, the BLEU value is always 0, and the dev outputs are all unk.
Like this:

step 6100 lr 0.00168035 step-time 0.08s exp-los 34.4890 gN 243.20 BLEU 0.00, Sat May 25 18:23:57 2019
step 6200 lr 0.00166674 step-time 0.08s exp-los 35.5610 gN 256.42 BLEU 0.00, Sat May 25 18:24:05 2019
step 6300 lr 0.00165346 step-time 0.08s exp-los 35.6490 gN 258.22 BLEU 0.00, Sat May 25 18:24:13 2019
step 6400 lr 0.0016405 step-time 0.08s exp-los 35.6305 gN 258.66 BLEU 0.00, Sat May 25 18:24:21 2019
step 6500 lr 0.00162783 step-time 0.08s exp-los 37.1311 gN 274.27 BLEU 0.00, Sat May 25 18:24:29 2019
step 6600 lr 0.00161545 step-time 0.08s exp-los 35.8512 gN 258.59 BLEU 0.00, Sat May 25 18:24:37 2019
step 6700 lr 0.00160335 step-time 0.08s exp-los 35.5759 gN 268.07 BLEU 0.00, Sat May 25 18:24:45 2019
step 6800 lr 0.00159152 step-time 0.08s exp-los 35.2491 gN 271.55 BLEU 0.00, Sat May 25 18:24:53 2019
step 6900 lr 0.00157995 step-time 0.08s exp-los 37.3241 gN 288.19 BLEU 0.00, Sat May 25 18:25:02 2019

Do you know why this happens? Thanks!

@Zachary-YL
Author

Actually, it is not exactly unk. If the last word in the vocabulary is 'Again', the output is always that last word, 'Again'.

Like this:
Every line of the files in saved_exp_model/output_dev and saved_exp_model/output_test is:

b'Again' b'Again' b'Again' b'Again' b'Again' b'Again' b'Again' b'Again' ... (b'Again' repeated for the full length of every output line)

@lovecambi
Owner

Can you double-check your source and target files? For example, they should contain strings, not ids.
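
For anyone else debugging this, a quick sanity check is to print the first few lines of each file and flag any that parse entirely as integers. This is only a minimal sketch; the file paths are placeholders for your own data:

```python
# Sanity check: the parallel corpus should contain token strings, not integer ids.
# "data/train.zh" / "data/train.en" are placeholder paths; use your own files.
def looks_like_ids(line):
    tokens = line.split()
    # If every token parses as an integer, the file is probably id-encoded.
    return len(tokens) > 0 and all(t.isdigit() for t in tokens)

for path in ["data/train.zh", "data/train.en"]:
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            print(path, repr(line.rstrip("\n")))
            if looks_like_ids(line):
                print("  WARNING: this line looks like ids, not strings")
            if i >= 2:  # inspect only the first few lines
                break
```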

@maohbao

maohbao commented Jul 17, 2020

Hi Zachary,

Did you solve this problem? I get the same result as you: the BLEU value is always 0, and the dev and test outputs are all b'Again' b'Again' b'Again'...

@maohbao

maohbao commented Jul 17, 2020

> Hello, when I use a Chinese (word-segmented) and English (tokenized) parallel corpus to train the expert model, the BLEU value is always 0, and the dev outputs are all unk. ... Do you know why this happens? Thanks!

Thank you very much if you can reply!

@yidaxing

yidaxing commented Aug 4, 2020

There may be a problem in how the vocabulary is created. I used onmt-build-vocab --size max_vocab_size --save_vocab $TEXT/src-vocab.txt $TEXT/train.en to create the vocabulary, and it worked.
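
For reference, something roughly equivalent can be done by hand: count token frequencies and keep the most common ones, one token per line. This is only a sketch; the special tokens below are a guess, and the exact tokens and their order depend on what this repo's data loader actually expects:

```python
from collections import Counter

# Minimal vocabulary builder: frequency-sorted tokens truncated to max_vocab_size,
# written one token per line. Check which special tokens (and in which order)
# this codebase actually expects before using the output.
def build_vocab(corpus_path, vocab_path, max_vocab_size):
    counts = Counter()
    with open(corpus_path, encoding="utf-8") as f:
        for line in f:
            counts.update(line.split())
    specials = ["<unk>", "<s>", "</s>"]  # hypothetical placeholders
    keep = max_vocab_size - len(specials)
    tokens = specials + [w for w, _ in counts.most_common(keep)]
    with open(vocab_path, "w", encoding="utf-8") as f:
        f.write("\n".join(tokens) + "\n")

build_vocab("train.en", "src-vocab.txt", 30000)
```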
