Problem in main (refactoring branch) #13

Open
hdmetor opened this issue Nov 18, 2015 · 10 comments

@hdmetor

hdmetor commented Nov 18, 2015

th main.lua -algo RCNN -backend cudnn gives me the following error on the refactoring branch:

nn.CrossEntropyCriterion
==> Converting model to CUDA
/home/ubuntu/torch/install/bin/luajit: /home/ubuntu/object-detection.torch/data.lua:28: attempt to index field 'algo' (a nil value)
stack traceback:
    /home/ubuntu/object-detection.torch/data.lua:28: in main chunk
    [C]: in function 'dofile'
    main.lua:33: in main chunk
    [C]: in function 'dofile'
    ...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
    [C]: at 0x00406670
@fmassa
Owner

fmassa commented Nov 18, 2015

Hi,
Thanks for pointing this out.
For the moment, if you want to train/test using RCNN, check examples/train_test_rcnn.lua. I still need to provide a model for it, though (or a way to load pre-trained models).
I was not paying much attention to the main.lua script because I was thinking of having only one file per framework in the examples folder, but I'll think about it again. I'll leave this open until I come up with a solution (either remove main.lua and point to the examples, or fix it).

@Ethiral

Ethiral commented Apr 13, 2016

th train_test_rcnn.lua gives me the error:
Using GPU mode on device 1
Using fixed seed: 1
/usr/bin/luajit: /home/ekanshv/models/zeiler.lua:30: attempt to call local 'spatialconv' (a nil value)
stack traceback:
    /home/ekanshv/models/zeiler.lua:30: in function 'createModel'
    train_test_rcnn.lua:53: in main chunk
    [C]: in function 'dofile'
    /usr/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

@fmassa
Owner

fmassa commented Apr 13, 2016

@Ethiral I've added a quick fix for your problem in commit a0a4a51.
For now, you need to pass the pre-trained model path as an argument to the script.

I will revamp this repo soon, adding simple examples and turning it into a package. Plus, I have some improvements to the code that I still need to push. I'll hopefully do it in the coming weeks.

@longwoo

longwoo commented Aug 2, 2016

Hi, th main.lua gives me the error:
==> Preparing BatchProvider for validation
/home/wulong/torch/install/bin/luajit: ./DataSetPascal.lua:222: Need to specify the bounding boxes file
stack traceback:
    [C]: in function 'assert'
    ./DataSetPascal.lua:222: in function 'loadROIDB'
    ./DataSetPascal.lua:315: in function 'attachProposals'
    ./BatchProvider.lua:73: in function 'setupData'
    /home/wulong/object-detection.torch/data.lua:96: in main chunk
    [C]: in function 'dofile'
    main.lua:33: in main chunk
    [C]: in function 'dofile'
    ...long/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

So how can I get the bounding boxes file, and where should I put it? Thank you.

@fmassa
Owner

fmassa commented Aug 2, 2016

@longwoo you can use whatever region proposal algorithm you want.
For example, you can use Selective Search, and the link for downloading the proposals can be found here.
This code assumes the same bounding box format as the one in the link I just sent.
You can put them anywhere, but the default location used by main.lua in the current master branch is data/selective_search_data, as can be seen in this part of the code.
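
A minimal sketch of copying downloaded proposals into that default location, assuming the commands are run from the repository root. The function name `install_proposals` and the source directory passed to it are hypothetical; the file names inside the download are not stated in this thread, so the sketch simply copies everything.

```shell
# Sketch: copy an unpacked proposals directory into the directory that
# main.lua reads by default on master (data/selective_search_data).
# install_proposals is a hypothetical helper, not part of the repository.
install_proposals() {
    src="$1"
    dest="${2:-data/selective_search_data}"
    if [ ! -d "$src" ]; then
        echo "proposal directory not found: $src" >&2
        return 1
    fi
    # Create the destination if needed, then copy the directory contents.
    mkdir -p "$dest" && cp -R "$src"/. "$dest"/
}

# Example call (the source path is a placeholder for wherever the
# download was unpacked):
# install_proposals "$HOME/Downloads/selective_search_data"
```

If the proposals are kept somewhere else, the path that main.lua reads has to be changed to match.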

@longwoo

longwoo commented Aug 3, 2016

Thanks. But after doing that (downloading the Selective Search files you mentioned and putting them in the right place) and running th main.lua -algo SPP -gpu 2 -seed 1, I get this error:

=> Creating model from file: models/zeiler.lua  
=> Criterion    
==> Converting model to CUDA    
Loading train metadata from cache   
Loading test metadata from cache    
Preparing conv5 features for VOC2007 trainval   
 [========== 5011/5011 =========>]  Tot: 2s104ms | Step: 0ms     
Preparing conv5 features for VOC2007 test   
Iteration: 1/300    
==> Preparing Batch Data    
 [========== 3476/3476 ====>]  Tot: 1m11s | Step: 21ms      
==> Training zeiler,seed=1  
 [========== 500/500 ===========>]  Tot: 21s830ms | Step: 42ms   
==> Training Error: 0.64886541676521    
ConfusionMatrix:
 + average row correct: 36.375306085462% 
 + average rowUcol correct (VOC measure): 29.776913725904% 
 + global correct: 83.2015625%
/home/wulong/torch/install/bin/luajit: bad argument #1 to '?' (must be strictly positive at /home/wulong/torch/pkg/torch/lib/TH/generic/THTensorMath.c:1420)
stack traceback:
    [C]: at 0x7f7340d95d20
    [C]: in function 'randperm'
    ./BatchProvider.lua:114: in function 'permuteIdx'
    ./BatchProvider.lua:245: in function 'getBatch'
    ./Tester.lua:33: in function 'validate'
    main.lua:76: in main chunk
    [C]: in function 'dofile'
    ...long/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

It seems that I am still missing something.

@fmassa
Owner

fmassa commented Aug 3, 2016

@longwoo It seems that you have not provided a test dataset. If you download Pascal VOC test dataset and put it in the datasets/VOCdevkit folder, it should work.
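
The `randperm` failure ("must be strictly positive") is consistent with an empty list of test images. A minimal sketch of a check for that, assuming the standard Pascal VOC devkit layout in which the test image ids are listed in `VOC2007/ImageSets/Main/test.txt`; whether this code reads exactly that file is an assumption, and `count_image_ids` is a hypothetical helper.

```shell
# Sketch: count the image ids listed in a Pascal VOC image-set file.
# count_image_ids is a hypothetical helper, not part of the repository.
count_image_ids() {
    if [ -f "$1" ]; then
        # grep -c prints the number of non-empty lines; it exits with
        # status 1 when that number is 0, so the status is ignored.
        grep -c . "$1" || true
    else
        echo 0
    fi
}

# Assumed location, following the standard VOC2007 devkit layout.
n=$(count_image_ids datasets/VOCdevkit/VOC2007/ImageSets/Main/test.txt)
if [ "$n" -eq 0 ]; then
    echo "no test image ids found under datasets/VOCdevkit" >&2
fi
```

A count of zero would suggest that the test split has not been unpacked into the datasets/VOCdevkit folder.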

@longwoo

longwoo commented Aug 8, 2016

In the refactoring branch, I ran th train_test_rcnn.lua, and it gives:

wulong@PVG-Dsk-004:~/object-detection.torch-refactoring$ th train_test_rcnn.lua 
-- ignore option gpu    
-- ignore option name   
-- ignore option modelpath  
-- ignore option numthreads 
[program started on Mon Aug  8 15:58:59 2016]   
[command line arguments]    
gpu 2   
seed    1   
name    rcnn-example    
save_step   100 
lr  0.001   
modelpath   /home/wulong/object-detection.torch-refactoring/data/models/frcnn_alexnet.t7    
numthreads  6   
num_iter    400 
disp_iter   1   
lr_step 300 
[----------------------]    
Using GPU mode on device 2  
Using fixed seed: 1 
/home/wulong/torch/install/bin/luajit: train_test_rcnn.lua:76: attempt to call method 'type' (a nil value)
stack traceback:
    train_test_rcnn.lua:76: in main chunk
    [C]: in function 'dofile'
    ...long/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

Now I am using "frcnn_alexnet.t7" as the pre-trained model. I also tried "Zeiler_imagenet_weights.mat". It does not seem right. So where can I get the exact pre-trained model? And if I want to train Fast R-CNN, should I use a different one? Thank you.

@fmassa
Owner

fmassa commented Aug 9, 2016

@longwoo if you want to train Fast R-CNN, you should use a model similar to frcnn_alexnet.t7, which is already fine-tuned for Pascal.
As for the error you are seeing, I couldn't find a line in the current code that matches the error reported at line 76. Are you sure that the model was loaded properly?

Also, I'll soon be pushing a new repo that uses this one. It will hopefully be a good starting point for using this code.

@longwoo

longwoo commented Aug 9, 2016

I'm looking forward to it. ^.^
