Hi, I am trying to train an English-Finnish translation engine on an IT-domain dataset (about 800,000 unique sentence pairs, 13 million English words), using 32,000 joint BPE operations (the resulting vocabularies are 13,500 for English and 22,700 for Finnish). The validation set (2,000 sentence pairs) is randomly extracted from the training data (and removed from it).
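For reference, the subword preprocessing was done roughly as follows (a sketch assuming subword-nmt for the BPE step; all file names and the destination directory are placeholders for my local setup):

# learn a joint BPE model (32000 merge operations) on the tokenized training data
subword-nmt learn-joint-bpe-and-vocab --input train.tok.en train.tok.fi -s 32000 \
  -o bpe32k.codes --write-vocabulary vocab.en vocab.fi

# apply the codes to both sides (likewise for the validation and test sets)
subword-nmt apply-bpe -c bpe32k.codes --vocabulary vocab.en < train.tok.en > train.bpe.en
subword-nmt apply-bpe -c bpe32k.codes --vocabulary vocab.fi < train.tok.fi > train.bpe.fi

# binarize for fairseq (placeholder destination directory)
fairseq preprocess -sourcelang en -targetlang fi -trainpref train.bpe \
  -validpref valid.bpe -testpref test.bpe -destdir data-bin/en-fi.it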
Using the fconv model, training completes nicely. The parameters are:
-model fconv -nenclayer 10 -nlayer 8 -dropout 0.2 -optim nag -lr 0.25 -clip 0.1 -batchsize 32 -maxbatch 3200 \
-momentum 0.99 -timeavg -bptt 0 -nembed 512 -noutembed 512 -nhid 512
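Spelled out in full, the training invocation is roughly the following (the -datadir and -savedir paths are placeholders):

fairseq train -sourcelang en -targetlang fi -datadir data-bin/en-fi.it \
  -model fconv -nenclayer 10 -nlayer 8 -dropout 0.2 -optim nag -lr 0.25 -clip 0.1 \
  -batchsize 32 -maxbatch 3200 -momentum 0.99 -timeavg -bptt 0 \
  -nembed 512 -noutembed 512 -nhid 512 -savedir trainings/fconv-en-fi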
The training ends up with these values:
| checkpoint 018 | epoch 018 | 1004778 updates | s/checkpnt 5190 | words/s 4001 | lr 0.000025 | avg_dict_size 8692.39
| checkpoint 018 | epoch 018 | 1004778 updates | trainloss 1.09 | train ppl 2.13
| checkpoint 018 | epoch 018 | 1004778 updates | validloss 1.43 | valid ppl 2.69 | testloss 3.06 | test ppl 8.32
With the blstm model, I haven't been able to get a proper training run. With the parameters suggested in the README, training ends after 2 epochs with a validation set perplexity of 99614929. I have tried different optimization algorithms and learning rates and different numbers of layers, and in all cases the validation perplexity is huge and the BLEU scores are very low. The lowest first-epoch validation perplexity (6500, though it then increases) comes with the following parameters:
-model blstm -dropout 0.3 -optim sgd -lr 0.25 -clip 25 -bptt 25 -nembed 512 -noutembed 512 -nhid 512
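For completeness, the BLEU scores mentioned above are measured by decoding the test set roughly like this (following the README's generation example; the model path is a placeholder) and then scoring with multi-bleu.perl once the BPE markers are removed:

DATA=data-bin/en-fi.it   # placeholder: the binarized data from preprocessing
fairseq generate-lines -sourcedict $DATA/dict.en.th7 -targetdict $DATA/dict.fi.th7 \
  -path trainings/blstm-en-fi/model_best.th7 -beam 10 -nbest 1 < test.bpe.en > out.txt

# extract the hypothesis lines from out.txt into hyps.txt, then undo the
# BPE segmentation before scoring, e.g.:
sed 's/@@ //g' < hyps.txt > hyps.words.txt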
Any idea what could be happening, or any suggestions? Thanks.