Word features in Translation #1534
I am getting this error while training. Example: महानगर│NN पालिका│NNPC अंतर्गत│JJ दत्तात्रय│NNPC नगर│NNPC माध्यमिक│NNPC स्कूल│NN के│PSP विद्यार्थियों│NN ने│PSP काल्पनिक│JJ किला│NN दत्तगढ़│NNP बनाकर│VM अपनी│PRP कल्पनाशक्ति│NN का│PSP परिचय│NN दिया│VM
Traceback (most recent call last):
Edit: It works if I use the default options of train.py, but not with the above-mentioned command.
I figured out that if I set rnn_size equal to the total word_vec_size (i.e. including the feature dimensions), then it works.
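The constraint described here can be sketched as follows (this is my reading of the thread, not code taken from OpenNMT-py itself): with concatenated feature embeddings, the effective input width is the word embedding size plus the feature embedding sizes, and the transformer expects that sum to equal rnn_size.

```python
# Sketch of the size constraint described above (a reading of this thread,
# not actual OpenNMT-py code): with concatenated feature embeddings, the
# effective input width is word_vec_size plus each feature's embedding
# size, and that total must equal rnn_size.
def effective_input_dim(word_vec_size, feat_vec_sizes):
    """Total embedding width when word and feature embeddings are concatenated."""
    return word_vec_size + sum(feat_vec_sizes)

# With word_vec_size=500 and one feature embedding of size 12, the total
# is 512, which matches -rnn_size 512; without the feature, 500 != 512.
print(effective_input_dim(500, [12]))  # 512
```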
@vikrant97 your request is unclear to me.
@vince62s Hi, I have two problems:
P.S. I edited the code to incorporate a list of feature_vec_sizes for different features F1, F2, F3... I hope my issue is clear? If yes, can you please help me out with these two problems?
OK, then what is the issue with setting, for instance, emb_size 500, feat_vec_size 6, hidden 512?
@vince62s Yes, that works fine because the total vec_size comes out to 512 (therefore no conflict with rnn_size). But the actual problem is: So I guess the trick you suggested will not work here, or am I missing something?
OK, I see. What you are asking is in fact a duplicate of this: @bpopeters, since you started to work on it, can you give him some pointers so that he can submit a PR?
Hi Vikrant,
Are the word features back?
From: Vikrant Goyal [mailto:[email protected]]
Sent: Wednesday, August 21, 2019 03:40
To: OpenNMT/OpenNMT-py
Cc: Subscribed
Subject: Re: [OpenNMT/OpenNMT-py] Word features in Translation (#1534)
I am getting this error while training.
Training command:
python train.py -data ../data/word_with_pos_only -save_model ../models/word_with_pos_only_emb12 -layers 6 -rnn_size 512 -word_vec_size 500 -transformer_ff 2048 -heads 8 -encoder_type transformer -decoder_type transformer -position_encoding -train_steps 150000 -max_generator_batches 2 -dropout 0.3 -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 2 -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 -learning_rate 1 -max_grad_norm 0 -param_init 0 -param_init_glorot -label_smoothing 0.1 -valid_steps 5000 -save_checkpoint_steps 20000 -world_size 1 -gpu_ranks 0
example: महानगर│NN पालिका│NNPC अंतर्गत│JJ दत्तात्रय│NNPC नगर│NNPC माध्यमिक│NNPC स्कूल│NN के│PSP विद्यार्थियों│NN ने│PSP काल्पनिक│JJ किला│NN दत्तगढ़│NNP बनाकर│VM अपनी│PRP कल्पनाशक्ति│NN का│PSP परिचय│NN दिया│VM
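For reference, the word-plus-tag format shown in the example line above can be produced with something like the following sketch (the separator character is copied from the example itself; verify the exact feature delimiter your OpenNMT-py version expects before relying on this):

```python
# Minimal sketch of producing the word│feature source format shown above.
# The separator is copied from the example in this thread; check your
# OpenNMT-py version's expected feature delimiter before using it.
SEP = "│"

def attach_features(tokens, tags):
    """Join each token with its feature tag using the feature separator."""
    assert len(tokens) == len(tags)
    return " ".join(f"{tok}{SEP}{tag}" for tok, tag in zip(tokens, tags))

line = attach_features(["किला", "बनाकर"], ["NN", "VM"])
print(line)  # किला│NN बनाकर│VM
```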
Traceback (most recent call last):
File "/home/vikrant.goyal/OpenNMT-py/train.py", line 109, in
main(opt)
File "/home/vikrant.goyal/OpenNMT-py/train.py", line 39, in main
single_main(opt, 0)
File "/home/vikrant.goyal/OpenNMT-py/onmt/train_single.py", line 127, in main
valid_steps=opt.valid_steps)
File "/home/vikrant.goyal/OpenNMT-py/onmt/trainer.py", line 249, in train
report_stats)
File "/home/vikrant.goyal/OpenNMT-py/onmt/trainer.py", line 364, in _gradient_accumulation
outputs, attns = self.model(src, tgt, src_lengths, bptt=bptt)
File "/home/vikrant.goyal/OpenNMT-py/myenv/lib/python3.5/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/vikrant.goyal/OpenNMT-py/onmt/models/model.py", line 46, in forward
memory_lengths=lengths)
File "/home/vikrant.goyal/OpenNMT-py/myenv/lib/python3.5/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/vikrant.goyal/OpenNMT-py/onmt/decoders/transformer.py", line 215, in forward
step=step)
File "/home/vikrant.goyal/OpenNMT-py/myenv/lib/python3.5/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/vikrant.goyal/OpenNMT-py/onmt/decoders/transformer.py", line 69, in forward
input_norm = self.layer_norm_1(inputs)
File "/home/vikrant.goyal/OpenNMT-py/myenv/lib/python3.5/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/vikrant.goyal/OpenNMT-py/myenv/lib/python3.5/site-packages/torch/nn/modules/normalization.py", line 157, in forward
input, self.normalized_shape, self.weight, self.bias, self.eps)
File "/home/vikrant.goyal/OpenNMT-py/myenv/lib/python3.5/site-packages/torch/nn/functional.py", line 1725, in layer_norm
torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[512], expected input with shape [*, 512], but got input of size [227, 16, 500]
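The error above says the decoder's LayerNorm was built for width 512 (rnn_size) but received width 500 (word_vec_size without the feature dimensions). A pure-Python mimic of the trailing-dimension check, just to make the mismatch concrete (this is an illustration, not PyTorch's actual implementation):

```python
# Pure-Python mimic of the trailing-dimension check that produces the
# RuntimeError above. Illustration only, not PyTorch's implementation.
def layer_norm_shape_check(input_shape, normalized_shape):
    """Require the input's trailing dims to match normalized_shape."""
    n = len(normalized_shape)
    if tuple(input_shape[-n:]) != tuple(normalized_shape):
        raise RuntimeError(
            f"Given normalized_shape={list(normalized_shape)}, expected input "
            f"with shape [*, {normalized_shape[-1]}], but got input of size "
            f"{list(input_shape)}"
        )
    return True

# Decoder input from the traceback: [seq_len, batch, emb] = [227, 16, 500]
try:
    layer_norm_shape_check((227, 16, 500), (512,))
except RuntimeError as e:
    print(e)
```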
Hi @eduamf
@vince62s I have updated (locally) the OpenNMT code to incorporate word features on the source side. Traceback (most recent call last):
If you want to discuss some of your code, please open a [WIP] PR, and then we can see what is going right and wrong. Without knowing what you have done, it is difficult to help. Most likely there is an issue with your input features at inference.
@vince62s The issue is solved and it is working on my end. After this I will also look into target-side features, if that works, and also into limiting the feature vocab sizes to a particular threshold (because that's needed for features like lemma).
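The vocab-capping idea mentioned here can be sketched with a frequency count: keep only the N most frequent feature values and map everything else to an unknown token. This is assumed behavior for illustration, not the actual OpenNMT-py vocabulary code.

```python
# Sketch of the feature-vocab threshold mentioned above (assumed behavior,
# not the actual OpenNMT-py implementation): keep the max_size most
# frequent feature values and map the rest to <unk>.
from collections import Counter

def cap_feature_vocab(values, max_size, unk="<unk>"):
    """Return a vocab of the max_size most frequent values, plus unk."""
    most_common = [v for v, _ in Counter(values).most_common(max_size)]
    return set(most_common) | {unk}

vocab = cap_feature_vocab(["NN", "NN", "VM", "JJ", "NN", "VM"], 2)
print(sorted(vocab))  # ['<unk>', 'NN', 'VM']
```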
Based on what's written here, I have a model with 4 features, so I set feat_vec_size to be the same for all of them. But I still get an error.
When I run with the regular word vector size I also get the error.
@Henry-E Just trying to document this a bit: when you have one extra feature of size 12, the two errors you reported are one from the encoder and one from the decoder.
Thanks for following up. Are there any actions that you need from me to recreate this? |
I'm trying to run source features with a transformer, and I'm getting this error:
Here are my parameters:
Hi there,
Hi @francoishernandez, thanks for the help! I tried what you suggested, and this is the output I got:
@vince62s Are there any plans to include source word features in the new data processing pipeline? They're widely used in many non-NMT applications such as NLG. |
Wow, closed as completed! |
No, @anderleich is working on it, but it may take time. Just cleaning up old issues.
I am trying to use word-level features based on the paper "Linguistic Input Features Improve Neural Machine Translation", but I couldn't find any proper documentation on how to use them.
I want to use different feature vector sizes for different features.
Can anyone help me out?
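To make the per-feature sizing request concrete, here is a hypothetical configuration sketch. Stock train.py exposes only a single -feat_vec_size flag, so a per-feature list like this would require a local code change, as discussed later in this thread; the sizing constraint itself is the same either way.

```python
# Hypothetical sketch only: stock train.py has a single -feat_vec_size
# flag, so giving each feature its own embedding size (as asked above)
# requires a local code change. The constraint stays the same: the
# concatenated embedding width must match -rnn_size.
word_vec_size = 500
feat_vec_sizes = [6, 4, 2]  # illustrative sizes for features F1, F2, F3
rnn_size = word_vec_size + sum(feat_vec_sizes)
print(rnn_size)  # 512
```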