Cudnn Error when running retro-baselines jobs (Nvidia-cuda docker error) #7

Closed
floodsung opened this issue Apr 9, 2018 · 5 comments

Comments

@floodsung

floodsung commented Apr 9, 2018

I got the following error when running a retro-baselines ppo2 job (using ppo2.docker directly):
ted and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
WARNING:tensorflow:From /root/venv/lib/python3.5/site-packages/baselines/common/distributions.py:147: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
2018-04-09 04:27:06.399347: E tensorflow/stream_executor/cuda/cuda_dnn.cc:396] Loaded runtime CuDNN library: 7102 (compatibility version 7100) but source was compiled with 7005 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2018-04-09 04:27:06.399956: F tensorflow/core/kernels/conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms)

It seems that the test server has an incompatible cuDNN version installed (this TensorFlow build requires cuDNN 7.0).
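
For anyone reading the log above, the two version integers can be decoded to make the mismatch explicit. This is a minimal sketch, assuming cuDNN's usual integer encoding of major * 1000 + minor * 100 + patch. The function names are hypothetical, and the equality check only mirrors the "compatibility version" values printed in the log; it is not necessarily the exact check TensorFlow performs.

```python
from typing import Tuple


def decode_cudnn_version(version: int) -> Tuple[int, int, int]:
    """Split an integer such as 7102 into (major, minor, patch).

    Assumes the encoding major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(version, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def compatibility_version(version: int) -> int:
    """Drop the patch level, e.g. 7102 -> 7100, as shown in the log."""
    return version - version % 100


# Values copied from the error message in this issue.
loaded_at_runtime = 7102
compiled_against = 7005

print(decode_cudnn_version(loaded_at_runtime))  # (7, 1, 2)
print(decode_cudnn_version(compiled_against))   # (7, 0, 5)

# Assumption: the build is considered compatible only when the
# compatibility versions (major and minor) are identical.
compatible = (
    compatibility_version(loaded_at_runtime)
    == compatibility_version(compiled_against)
)
print(compatible)  # False
```

Under that assumption, the runtime library (7.1.2) and the version TensorFlow was compiled with (7.0.5) differ in the minor version, which matches the reported failure.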

@floodsung
Author

floodsung commented Apr 9, 2018

I just found that nvidia-docker updated the cuDNN version to 7.1 a week ago.

@floodsung floodsung changed the title Cudnn Error when running retro-baselines jobs Cudnn Error when running retro-baselines jobs (Nvidia-cuda docker error) Apr 9, 2018
@endrift
Contributor

endrift commented Apr 9, 2018

I put the wrong issue number in the commit, but this should be fixed. Let me know if you encounter more errors.

@floodsung
Author

Still the same error today!

@endrift
Contributor

endrift commented Apr 10, 2018

You'll need to pull openai/retro-agent again and then try rebuilding.

@floodsung
Author

Thanks, it is OK now!
