Assertion error during training #24

Open
uk-ci-github opened this issue Feb 21, 2020 · 9 comments
@uk-ci-github

Hello,
I'm trying to reproduce the paper results, but I get an assertion error after a few epochs while running "make train".
Do you have any suggestions?
Thanks!

Example 700
Original Source: b'Write unbuffered output of "python -u client.py" to standard output and to "logfile"'
Source: [b'w', b'r', b'i', b't', b'e', b' ', b'_', b'_', b'S', b'P', b'_', b'_', b'U', b'N', b'K', b' ', b'o', b'u', b't', b'p', b'u', b't', b' ', b'o', b'f', b' ', b'_', b'_', b'S', b'P', b'_', b'_', b'U', b'N', b'K', b' ', b't', b'o', b' ', b's', b't', b'a', b'n', b'd', b'a', b'r', b'd', b' ', b'o', b'u', b't', b'p', b'u', b't', b' ', b'a', b'n', b'd', b' ', b't', b'o', b' ', b'_', b'_', b'S', b'P', b'_', b'_', b'U', b'N', b'K']
GT Target 1: b'python -u client.py | tee logfile'
Prediction 1: b'' (0.0, 0)
Prediction 2: b'echo __SP__UNK | tee __SP__UNK' (0.36514837167011077, 0.2033717397090786)
Prediction 3: b'ls -l -R __SP__UNK | tee __SP__UNK' (0.3086066999241838, 0.18043239916836057)

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 378, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 374, in main
    eval(dataset, verbose=True)
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 176, in eval
    return eval_tools.automatic_eval(prediction_path, dataset, top_k=3, FLAGS=FLAGS, verbose=verbose)
  File "/root/workspace/nl2bash/eval/eval_tools.py", line 249, in automatic_eval
    top_k, num_samples, verbose)
  File "/root/workspace/nl2bash/eval/eval_tools.py", line 361, in get_automatic_evaluation_metrics
    bleu = token_based.corpus_bleu_score(command_gt_asts_list, pred_ast_list)
  File "/root/workspace/nl2bash/eval/token_based.py", line 70, in corpus_bleu_score
    gt_tokens_list = [[data_tools.bash_tokenizer(ast, ignore_flag_order=True) for ast in gt_asts] for gt_asts in gt_asts_list]
  File "/root/workspace/nl2bash/eval/token_based.py", line 70, in <listcomp>
    gt_tokens_list = [[data_tools.bash_tokenizer(ast, ignore_flag_order=True) for ast in gt_asts] for gt_asts in gt_asts_list]
  File "/root/workspace/nl2bash/eval/token_based.py", line 70, in <listcomp>
    gt_tokens_list = [[data_tools.bash_tokenizer(ast, ignore_flag_order=True) for ast in gt_asts] for gt_asts in gt_asts_list]
  File "/root/workspace/nl2bash/bashlint/data_tools.py", line 58, in bash_tokenizer
    with_prefix=with_prefix, with_flag_argtype=with_flag_argtype)
  File "/root/workspace/nl2bash/bashlint/data_tools.py", line 250, in ast2tokens
    return to_tokens_fun(node)
  File "/root/workspace/nl2bash/bashlint/data_tools.py", line 102, in to_tokens_fun
    assert(loose_constraints or node.get_num_of_children() == 1)
AssertionError
Makefile:41: recipe for target 'train' failed
make: *** [train] Error 1
@todpole3
Member

todpole3 commented Mar 5, 2020

The error is triggered by this line:
https://github.com/TellinaTool/nl2bash/blob/master/eval/token_based.py#L70

You may temporarily bypass this by changing line 70 to

gt_tokens_list = [[data_tools.bash_tokenizer(ast, loose_constraints=True, ignore_flag_order=True) for ast in gt_asts] for gt_asts in gt_asts_list]

The loose_constraints flag tells the Bash tokenizer to produce a tokenization despite any parse tree errors it encounters. Without this flag, it throws an assertion error.
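
To make the effect of the flag concrete, here is a toy sketch of the strict/loose pattern. Only the assertion expression is taken from the traceback above; the Node class and root_to_tokens function are hypothetical stand-ins written for illustration and are not the bashlint code.

```python
class Node:
    """Hypothetical minimal parse tree node (not the bashlint class)."""

    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

    def get_num_of_children(self):
        return len(self.children)


def root_to_tokens(root, loose_constraints=False):
    """Flatten a tree into tokens, optionally tolerating a malformed root."""
    # Same shape as the failing assertion in the traceback: strict mode
    # insists on exactly one child under the root, loose mode skips the check.
    assert loose_constraints or root.get_num_of_children() == 1

    tokens = []

    def walk(node):
        tokens.append(node.value)
        for child in node.children:
            walk(child)

    for child in root.children:
        walk(child)
    return tokens


# A root with two children stands in for a tree the parser got wrong.
malformed_root = Node("root", [Node("python"), Node("tee")])
loose_tokens = root_to_tokens(malformed_root, loose_constraints=True)
```

In strict mode the same call raises AssertionError, which is the failure mode shown in the first comment. Loose mode still returns tokens, so any parse problem is hidden rather than fixed.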

It looks like something might be wrong with the data since this error shouldn't be triggered when parsing a ground truth command.

It looks like it choked on this one:

python -u client.py | tee logfile

I'm going to debug the bash_tokenizer further and get back to you.
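
While that is being debugged, one way to see how many ground truth commands are affected is to run the strict tokenizer over each command and record the ones that assert. The sketch below assumes the tokenizer can be passed in as a plain callable; find_failing_commands and strict_stand_in are hypothetical names, and the stand-in uses an arbitrary rule (it rejects empty strings) that says nothing about the real cause.

```python
def find_failing_commands(commands, tokenize):
    """Return (index, command) pairs for which tokenize raises AssertionError."""
    failures = []
    for index, command in enumerate(commands):
        try:
            tokenize(command)
        except AssertionError:
            failures.append((index, command))
    return failures


def strict_stand_in(command):
    """Hypothetical tokenizer used only to exercise the helper above."""
    # Arbitrary rule for the demo: an empty command is treated as malformed.
    assert command.strip() != ""
    return command.split()


commands = ["ls -l", "", "python -u client.py | tee logfile"]
failing = find_failing_commands(commands, strict_stand_in)
```

With the real tokenizer in place of the stand-in, the returned list would show whether the problem is limited to a handful of commands or affects the data more broadly.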

@uk-ci-github
Author

Thanks a lot for your suggestion, I'll try it in the meantime.

@cjy-cc

cjy-cc commented Mar 31, 2020

Hello. When I tried to reproduce the code, I had the same problem. I also set loose_constraints, but I still got an error. I hope you can guide me. Thank you very much.

Makefile:41: recipe for target 'train' failed
make: *** [train] Error 1

@uk-ci-github
Author

Hello,
after adding loose_constraints=True I managed to train 5 models out of 7 successfully.
Their performance is much lower than that reported in NL2Bash's Table 15 for the automatic evaluation on the dev set.
I'm not sure the BLEU score is computed correctly, because I get suspicious warnings.
For example, bash-token.sh --decode --gpu 0 reports

[...]
The hypothesis contains 0 counts of 4-gram overlaps.
Therefore the BLEU score evaluates to 0, independently of
how many N-gram overlaps of lower order it contains.
Consider using lower n-gram order or use SmoothingFunction()
[...]
The hypothesis contains 0 counts of 2-gram overlaps.
[...]
The hypothesis contains 0 counts of 3-gram overlaps.
[...]
701 examples evaluated
Top 1 Template Acc = 0.000
Top 1 Command Acc = 0.000
Average top 1 Template Match Score = 0.066
Average top 1 BLEU Score = 0.236
Top 3 Template Acc = 0.001
Top 3 Command Acc = 0.000
Average top 3 Template Match Score = 0.156
Average top 3 BLEU Score = 0.335
Corpus BLEU = 0.035
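
The warnings above describe a general property of BLEU: the score is a geometric mean of n-gram precisions, so a single order with zero overlaps drives the whole score to zero. The sketch below is a simplified, self-contained BLEU written for illustration. It is not the implementation used by the evaluation script, and its add-one smoothing on every order is an assumption chosen for brevity. The reference and hypothesis are the ground truth and Prediction 2 from the first comment, split on whitespace.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(reference, hypothesis, n):
    """Return (clipped matches, total hypothesis n-grams) for order n."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    matches = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return matches, sum(hyp_counts.values())


def simple_bleu(reference, hypothesis, max_n=4, smooth=False):
    """Single-reference BLEU with uniform weights (simplified sketch)."""
    if not hypothesis:
        return 0.0
    log_sum = 0.0
    for n in range(1, max_n + 1):
        matches, total = modified_precision(reference, hypothesis, n)
        if smooth:
            # Add-one smoothing keeps every precision strictly positive.
            matches, total = matches + 1, total + 1
        if matches == 0 or total == 0:
            # One empty order zeroes the geometric mean.
            return 0.0
        log_sum += math.log(matches / total) / max_n
    # Brevity penalty for hypotheses that are not longer than the reference.
    if len(hypothesis) > len(reference):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(reference) / len(hypothesis))
    return penalty * math.exp(log_sum)


reference = "python -u client.py | tee logfile".split()
hypothesis = "echo __SP__UNK | tee __SP__UNK".split()
unsmoothed = simple_bleu(reference, hypothesis)
smoothed = simple_bleu(reference, hypothesis, smooth=True)
```

Here the hypothesis shares unigrams and one bigram with the reference but no trigram, so the unsmoothed score is zero while the smoothed score is positive. The warnings on their own therefore do not show that the metric is miscomputed, though they do not explain the low accuracy numbers either.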

@uk-ci-github
Author

bash-char.sh crashes at the end of the 1st epoch with this output

100%|█████████████████████████████████████| 4000/4000 [4:32:36<00:00,  4.09s/it]
Training loss = nan is too large.
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 378, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 353, in main
    train(train_set, dataset)
  File "/root/workspace/nl2bash/encoder_decoder/translate.py", line 111, in train
    raise graph_utils.InfPerplexityError
encoder_decoder.graph_utils.InfPerplexityError
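
The log above only reports the NaN loss after all 4000 steps of the epoch have run. A small helper like the one below could narrow down the step at which the loss first stops being finite, assuming the per-step losses can be collected as plain floats. The function name and the sample values are made up for illustration and are not taken from the nl2bash code.

```python
import math


def first_non_finite_step(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Made-up loss values, not taken from the run above.
sample_losses = [2.31, 1.87, float("nan"), 1.50]
bad_step = first_non_finite_step(sample_losses)
```

Knowing the first bad step makes it possible to inspect the batch fed at that point instead of waiting several hours for the end-of-epoch check.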

@uk-ci-github
Author

bash-copy-partial-token.sh crashes even before starting the training with this output

Bashlint grammar set up (124 utilities)

Reading data from /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash
Saving models to /root/workspace/etsiCuts/nl2bash/encoder_decoder/../model/seq2seq
Loading data from /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash
source file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.nl.filtered
target file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.cm.filtered
9985 data points read.
[...]
Loading data from /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash
source vocabulary size = 1570
target vocabulary size = 1214
max source token size = 19
max target token size = 40
source file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.nl.filtered
target file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.cm.filtered
source tokenized sequence file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.nl.partial.token
target tokenized sequence file: /root/workspace/etsiCuts/nl2bash/encoder_decoder/../data/bash/train.cm.partial.token
9985 data points read.
max_source_length = 181
max_target_length = 205
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/workspace/etsiCuts/nl2bash/encoder_decoder/translate.py", line 378, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/root/workspace/etsiCuts/nl2bash/encoder_decoder/translate.py", line 320, in main
    train_set, dev_set, test_set = data_utils.load_data(FLAGS, use_buckets=True)
  File "/root/workspace/etsiCuts/nl2bash/encoder_decoder/data_utils.py", line 137, in load_data
    use_buckets=use_buckets, add_start_token=True, add_end_token=True)
  File "/root/workspace/etsiCuts/nl2bash/encoder_decoder/data_utils.py", line 251, in read_data
    sc_copy_tokens, tg_copy_tokens, vocab.tg_vocab, token_ext)
  File "/root/workspace/etsiCuts/nl2bash/encoder_decoder/data_utils.py", line 738, in compute_copy_indices
    assert(len(sc_tokens) == len(sc_copy_tokens))
AssertionError
Makefile:41: recipe for target 'train' failed
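
The failing assertion compares the lengths of two token sequences for the same example. To find out which examples disagree, a check along these lines could be run over the paired sequences before training. It assumes both are available as lists of token lists; the helper name and the sample rows are hypothetical, and the check does not explain why the lengths differ.

```python
def find_length_mismatches(sc_token_rows, sc_copy_token_rows):
    """Return (index, len_tokens, len_copy_tokens) for every pair whose lengths differ."""
    mismatches = []
    for index, (tokens, copy_tokens) in enumerate(zip(sc_token_rows, sc_copy_token_rows)):
        if len(tokens) != len(copy_tokens):
            mismatches.append((index, len(tokens), len(copy_tokens)))
    return mismatches


# Made-up rows: the second pair differs in length.
sample_tokens = [["list", "files"], ["write", "output", "to", "logfile"]]
sample_copy_tokens = [["list", "files"], ["write", "output", "logfile"]]
mismatched = find_length_mismatches(sample_tokens, sample_copy_tokens)
```

Printing the mismatching examples alongside their source sentences would show whether the two tokenizations were produced from the same input.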

todpole3 self-assigned this Apr 3, 2020
@QuinVIVER

I hit the same problem as reported above: bash-copy-partial-token.sh crashes before training starts, with the same output and the same AssertionError at assert(len(sc_tokens) == len(sc_copy_tokens)) in compute_copy_indices (data_utils.py, line 738). Any idea how to fix this?

@QuinVIVER

I hit this too: bash-char.sh crashes at the end of the 1st epoch with the same output as reported above ("Training loss = nan is too large." followed by encoder_decoder.graph_utils.InfPerplexityError).

@NingYueran

Hello, have you solved this problem (AssertionError)? I tried many times but failed. Please share your solution. Thank you very much!


5 participants