classification speed #134

Open
neobarney opened this issue Jun 26, 2017 · 14 comments

Comments

@neobarney

Hello, when running crfasrnn_demo.py I can't get below 4 s per image on a K80 or a P100.
I'm sure it's running on the GPU (I checked the memory allocations).

That seems pretty slow to me. Does anyone have any idea how I could speed up the classification? Thanks
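
A single end-to-end measurement can include one-time costs such as GPU context creation and the first forward pass, so it may help to time only the repeated call. The following is a minimal sketch, assuming Python 3; `time_call` is a hypothetical helper and is not part of crfasrnn_demo.py.

```python
from statistics import median
from timeit import default_timer


def time_call(fn, warmup=2, repeats=5):
    """Return the median wall-clock seconds of fn() over `repeats` timed runs."""
    # Untimed warm-up runs absorb one-time setup costs.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = default_timer()
        fn()
        samples.append(default_timer() - start)
    return median(samples)


# Assumed usage with pycaffe, where `net` is an already-loaded caffe.Net:
#   caffe.set_device(0)
#   caffe.set_mode_gpu()
#   print(time_call(net.forward))
```

If the median is far below 4 s, the extra time is probably spent outside the forward pass (model loading, pre-processing, or the first call).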

@bittnt
Collaborator

bittnt commented Jun 26, 2017

With a Titan X Pascal, running crfasrnn_demo.py takes 0.4 seconds per image (500x500) in GPU mode.

@neobarney
Author

@bittnt Thanks. Are you using crfasrnn_demo.py or a custom script? Are you running in Docker?

@neobarney
Author

You meant 500x500?

@bittnt
Collaborator

bittnt commented Jun 26, 2017

500x500 is the image resolution. I was using the crfasrnn_demo.py script.

@neobarney
Author

neobarney commented Jun 27, 2017

@bittnt
Compilation works without cuDNN but fails with it. My previous install was built without cuDNN, which explains the slow speed.
So I'm now trying to compile the modified version of Caffe with CUDA 8 and cuDNN 6.
Which versions did you use in your setup?
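
Whether a build used cuDNN can be read from the switches in Caffe's `Makefile.config`, where `USE_CUDNN := 1` and `CPU_ONLY := 1` are commented out by default. Below is a minimal sketch that reports which switches are active; `enabled_flags` is a hypothetical helper, and the sample text is illustrative rather than taken from this thread.

```python
import re


def enabled_flags(makefile_text, names=("USE_CUDNN", "CPU_ONLY")):
    """Report which `NAME := 1` switches are uncommented in Makefile.config text."""
    result = {}
    for name in names:
        # A line starting with '#' is a comment and does not match.
        pattern = r"^[ \t]*" + re.escape(name) + r"[ \t]*:=[ \t]*1\b"
        found = re.search(pattern, makefile_text, flags=re.MULTILINE)
        result[name] = found is not None
    return result


# Illustrative input: both switches are still commented out.
sample = """\
# USE_CUDNN := 1
# CPU_ONLY := 1
"""
print(enabled_flags(sample))
```

To check a real build, read the file first, for example `enabled_flags(open("Makefile.config").read())`, assuming the script runs from the Caffe source directory.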

@neobarney
Author

@bittnt I was finally able to compile it and run the test successfully, but only with CUDA 7 and cuDNN 3 (all newer versions fail). The speed is still much slower than yours: 4.2 s for the default image.
How did you achieve that speed? Would you mind sharing your CUDA and cuDNN versions? Thanks :)

@bittnt
Collaborator

bittnt commented Jun 29, 2017

I have tested this under CUDA8, CUDA7.5, CUDA6.5, and CUDA6.

@neobarney
Author

neobarney commented Jun 29, 2017

@bittnt Great to know :) Do you compile with cuDNN or not?
Yesterday I managed to compile with cuDNN 3 and CUDA 7; on the K80 the speed was the same (4.2 s) as without cuDNN.
I can't get below 4 s on either the K80 or the P100.
Any idea where the problem might come from?

@bittnt
Collaborator

bittnt commented Jun 29, 2017

I am not sure what the problem is. The speed you reported sounds like the whole FCN-8s+crfasrnn network is running on the CPU rather than the GPU. Also, check which version of the code you are using.

I have tested the code on a K80, on both AWS and Google Cloud, and it should take less than 1 second for an image with resolution 500x500.
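
Allocated GPU memory may not by itself show that the forward pass executes on the GPU; sampling GPU utilization while the demo runs is a more direct check. The sketch below parses the CSV output of `nvidia-smi`. The helper names are hypothetical, and it assumes `nvidia-smi` is on the PATH and reports numeric values for every field.

```python
import subprocess

# Query one CSV row per GPU: index, utilization in percent, used memory in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse nvidia-smi CSV rows into (index, utilization %, memory MiB) tuples."""
    rows = []
    for line in text.strip().splitlines():
        index, util, mem = [field.strip() for field in line.split(",")]
        # int() raises ValueError if a field is not numeric (e.g. unsupported).
        rows.append((int(index), int(util), int(mem)))
    return rows


def sample_gpus():
    """Run nvidia-smi once and return the parsed rows."""
    out = subprocess.check_output(QUERY).decode()
    return parse_gpu_csv(out)
```

Calling `sample_gpus()` repeatedly from a second terminal during inference gives a rough picture: utilization that stays near zero while memory is allocated would be consistent with the computation running on the CPU.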

@akashdexati

akashdexati commented Oct 10, 2017

@bittnt
You wrote - "I have tested this under CUDA8, CUDA7.5, CUDA6.5, and CUDA6"

Can you please tell me which version of cuDNN you used with CUDA 8?
I am asking because only CUDA 7 + cuDNN 3 worked for me. Every other combination gave one error or another.

@bittnt
Collaborator

bittnt commented Nov 14, 2017

@akashdexati I think the error should be resolved if you use the crfasrnn branch (https://github.com/torrvision/caffe/tree/crfrnn) of the code rather than master. You will need to change the prototxt slightly to use the new branch. The new branch merges the CRFasRNN layer with the latest Caffe, which supports CUDA 8 and the latest cuDNN.

@akashdexati

@bittnt
I do not see anything like a crfasrnn branch. Can you please help me find that branch?

@bittnt
Collaborator

bittnt commented Nov 15, 2017

@akashdexati

@bittnt I tried the crfrnn branch, but no combination of CUDA 8 with cuDNN 5, 5.1, 6, or 7 worked.
If it worked for you, could you please share the cudnn.hpp file that worked with CUDA 8 + cuDNN (x)?
