
Error when training #103

Open: anbasile opened this issue Feb 13, 2018 · 0 comments

I am using seq2seq-attn to go from AMRs to English.

I am using this command:

th train.lua -data_file data/delexichar-train.hdf5 -val_data_file data/delexichar-val.hdf5 -savefile delexichar -use_chars_enc 1 -use_chars_dec 1 -gpuid 1 -save_every 5 -cudnn 1

And I get the following error:

using CUDA on GPU 1...
loading cudnn...
loading data...
done!
Source vocab size: 33296, Target vocab size: 47527
Source max sent len: 336, Target max sent len: 338
Number of additional features on source side: 0
Switching on memory preallocation
Number of parameters: 43885727 (active: 43885727)
/home/ubuntu/src/torch/install/bin/luajit: bad argument #2 to '?' (end index out of bound)
stack traceback:
        [C]: at 0x7f957c1aa210
        [C]: in function '__index'
        train.lua:395: in function 'train_batch'
        train.lua:750: in function 'train'
        train.lua:1091: in function 'main'
        train.lua:1094: in main chunk
        [C]: in function 'dofile'
        .../src/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00405d50

I am running this on a p2.xlarge machine on AWS. Any ideas on how to fix this?
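For context on what this error means: Torch raises "end index out of bound" when a `narrow`/index operation on a tensor dimension runs past that dimension's size. One plausible scenario (an assumption, not confirmed from train.lua itself) is that a preallocated buffer is sized for a shorter maximum sentence length than the data actually contains, e.g. the source max of 336 reported in the log. A minimal stdlib Python sketch of that bounds check, with hypothetical lengths:

```python
# Sketch (assumption): mimic Torch's 1-indexed narrow(dim, start, length)
# bounds check, which fails when start + length - 1 exceeds the dim size.
# The buffer length below is hypothetical; 336 is the source max sent len
# reported in the training log above.

def narrow(size, start, length):
    """Return the (start, end) slice, 1-indexed, or raise like Torch does."""
    end = start + length - 1
    if end > size:
        raise IndexError("bad argument #2 to '?' (end index out of bound)")
    return (start, end)

prealloc_max_len = 200   # hypothetical preallocated max sentence length
batch_sent_len = 336     # longest source sentence in the data

try:
    narrow(prealloc_max_len, 1, batch_sent_len)
except IndexError as e:
    print("reproduced:", e)
```

If this is the cause, checking that the lengths used at preprocessing time cover the longest sentences in the training data would be the first thing to verify.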
