Hello!
I ran into a problem with the segmentation loss when training on my own dataset. My segmentation masks were converted to mode "L". In ori_big.py, the model predicts a segmentation output of size [x, 2, x, x], but training fails with an error at CrossEntropyLoss2d. Could you give me some help? Thanks!
Error:
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:106: cunn_SpatialClassNLLCriterion_updateOutput_kernel: block: [4,0,0], thread: [189,0,0] Assertion t >= 0 && t < n_classes failed.
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED (createCuDNNHandle at /pytorch/aten/src/ATen/cudnn/Handle.cpp:9)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x46 (0x7f81564a5536 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: + 0x10a0c28 (0x7f81579a1c28 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #2: at::native::getCudnnHandle() + 0xe54 (0x7f81579a3404 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #3: + 0xf19f4c (0x7f815781af4c in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #4: + 0xf1afe1 (0x7f815781bfe1 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #5: + 0xf1f01b (0x7f815782001b in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #6: at::native::cudnn_convolution_backward_input(c10::ArrayRef, at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long, bool, bool) + 0xb2 (0x7f8157820572 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #7: + 0xf86090 (0x7f8157887090 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #8: + 0xfca928 (0x7f81578cb928 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #9: at::native::cudnn_convolution_backward(at::Tensor const&, at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long, bool, bool, std::array<bool, 2ul>) + 0x4fa (0x7f8157821c0a in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #10: + 0xf863bb (0x7f81578873bb in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #11: + 0xfca984 (0x7f81578cb984 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #12: + 0x2c80736 (0x7f8191037736 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #13: + 0x2ccff44 (0x7f8191086f44 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #14: torch::autograd::generated::CudnnConvolutionBackward::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&) + 0x378 (0x7f8190c4f908 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #15: + 0x2d89705 (0x7f8191140705 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #16: torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&) + 0x16f3 (0x7f819113da03 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #17: torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&, bool) + 0x3d2 (0x7f819113e7e2 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #18: torch::autograd::Engine::thread_init(int) + 0x39 (0x7f8191136e59 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #19: torch::autograd::python::PythonEngine::thread_init(int) + 0x38 (0x7f819da7e968 in /home/derek/anaconda3/envs/jim/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #20: + 0xc819d (0x7f81ac9d019d in /home/derek/anaconda3/envs/jim/bin/../lib/libstdc++.so.6)
frame #21: + 0x76db (0x7f81ae1696db in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #22: clone + 0x3f (0x7f81ade9271f in /lib/x86_64-linux-gnu/libc.so.6)
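The assertion `t >= 0 && t < n_classes` indicates that at least one target value passed to the loss lies outside the range [0, n_classes). With a 2-channel prediction, that range is {0, 1}. One possible cause, which is only an assumption here, is that the "L" masks store the foreground as 255 rather than 1. The following is a minimal sketch of how the label values could be checked and remapped before training. The helper names `check_mask` and `binarize_mask`, the threshold of 127, and the sample array are all hypothetical and are not part of the repository.

```python
import numpy as np

N_CLASSES = 2  # matches the [x, 2, x, x] prediction described above


def check_mask(mask, n_classes=N_CLASSES):
    """Return the sorted label values that fall outside [0, n_classes)."""
    values = np.unique(mask)
    return [int(v) for v in values if v < 0 or v >= n_classes]


def binarize_mask(mask, threshold=127):
    """Map an 8-bit grayscale mask to class indices 0/1.

    Assumes foreground pixels are brighter than `threshold`.
    """
    return (np.asarray(mask) > threshold).astype(np.int64)


# Hypothetical mask as it might look after Image.convert("L"):
# background stored as 0, foreground stored as 255.
raw = np.array([[0, 255], [255, 0]], dtype=np.uint8)

bad = check_mask(raw)            # values the loss would reject for 2 classes
fixed = binarize_mask(raw)       # remapped to class indices 0 and 1
still_bad = check_mask(fixed)    # empty if every label is now valid
```

If `check_mask` reports any values on the real masks, the later `CUDNN_STATUS_NOT_INITIALIZED` message may simply be a follow-on failure from the earlier device-side assertion rather than a separate cuDNN problem. Running a single batch on the CPU usually gives a clearer error message for this kind of out-of-range target.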