
GPGPU-sim with cuDNN #112

Open
bigwater opened this issue Apr 11, 2019 · 11 comments

@bigwater commented Apr 11, 2019

Environment
Ubuntu 16.04.4 LTS
gcc/g++ 5.4.0 20160609
Python 2.7
CUDA 8.0
cuDNN 7.1.4
gpgpu-sim_distribution dev branch
pytorch-gpgpu-sim (removed the git dependencies of nervanagpu @ d4eefd5, since the repo is no longer there)

PYTORCH_BIN=/usr/lib/x86_64-linux-gnu/libcudnn.so

The configurations used are from the configs folder of the gpgpu-sim dev branch.
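For completeness, this is roughly how I point the simulator at libcudnn before the run. It is just a minimal sketch: I assume that setting the variable from inside the script (before the first CUDA call) is equivalent to exporting it in the shell, and that GPGPU-Sim's setup_environment has already been sourced.

import os

# PYTORCH_BIN tells GPGPU-Sim which shared library to extract PTX from.
# It has to be visible before the simulator initializes, i.e. before the
# first CUDA call, so it is set before torch is imported.
os.environ["PYTORCH_BIN"] = "/usr/lib/x86_64-linux-gnu/libcudnn.so"

import torch  # imported only after the environment is prepared

print(torch.cuda.is_available())  # first CUDA call; the simulator banner should show up here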

MNIST
I use the MNIST sample from the repository below (following deval281shah's suggestions in another discussion):
https://github.com/gpgpu-sim/gpgpu-sim_simulations

Config
I use the TITANV config.
I also tried the TITANX config, but a deadlock happened with that configuration.

The simulation ran for 39 minutes; I checked that it has simulated a number of kernels, and it reported related information such as IPC.

At the beginning, it generated a large number of warnings saying it cannot find the required device functions:

Warning: cannot find deviceFun maxwell_zgemmBatched_32x32_raggedMn_ct
Warning: cannot find deviceFun maxwell_zgemmBatched_64x32_raggedMn_ct
Warning: cannot find deviceFun maxwell_zgemmBatched_32x32_raggedMn_cn
Warning: cannot find deviceFun maxwell_zgemmBatched_64x32_raggedMn_cn
Warning: cannot find deviceFun maxwell_zgemmBatched_32x32_raggedMn_tc
Warning: cannot find deviceFun maxwell_zgemmBatched_64x32_raggedMn_tc
Warning: cannot find deviceFun maxwell_zgemmBatched_32x32_raggedMn_tt
Warning: cannot find deviceFun maxwell_zgemmBatched_64x32_raggedMn_tt
Warning: cannot find deviceFun maxwell_zgemmBatched_32x32_raggedMn_tn
Warning: cannot find deviceFun maxwell_zgemmBatched_64x32_raggedMn_tn
Warning: cannot find deviceFun maxwell_zgemmBatched_32x32_raggedMn_nc

Therefore, cudaLaunchKernel fails to find the device function.
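To see whether those kernels even ship a PTX image inside the library (GPGPU-Sim can only simulate kernels for which PTX is available), I ran something like the sketch below; the cuobjdump flag and the output handling are just my assumption of how to check it, not part of the simulator.

import subprocess

LIB = "/usr/lib/x86_64-linux-gnu/libcudnn.so"

# List the PTX images embedded in the library, if any. Kernels that are
# shipped only as SASS have no PTX for GPGPU-Sim to pick up, which would
# explain the "cannot find deviceFun" warnings.
listing = subprocess.run(["cuobjdump", "--list-ptx", LIB],
                         capture_output=True, text=True).stdout
print(listing or "no PTX sections found")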

However, it ends up with
...

GPGPU-Sim PTX: Setting up arguments for 4 bytes starting at 0x7ffc0b96aa68..

GPGPU-Sim PTX: cudaLaunch for 0x0x4321f0 (mode=performance simulation) on stream 0
GPGPU-Sim PTX: ERROR launching kernel -- no PTX implementation found for 0x4321f0

Has anyone encountered this problem before? Any suggestions?

Thank you so much for your help.

@RedCarrottt (Contributor) commented Apr 13, 2019

Hi,
I have not encountered this issue. In my case, loading ptxinfo and extracting the PTX files went well.

Did you set CUDA_INSTALL_PATH? GPGPU-Sim may call the ptxas command based on the value.
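As a quick sanity check, something like the sketch below shows which ptxas would be used; the paths here are only illustrative.

import os
import shutil
import subprocess

cuda_home = os.environ.get("CUDA_INSTALL_PATH")
print("CUDA_INSTALL_PATH =", cuda_home)

# ptxas should be reachable under $CUDA_INSTALL_PATH/bin (or at least on $PATH),
# otherwise GPGPU-Sim cannot run it to collect the ptxinfo it needs.
ptxas = os.path.join(cuda_home, "bin", "ptxas") if cuda_home else shutil.which("ptxas")
print(subprocess.check_output([ptxas, "--version"], text=True))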

bigwater changed the title from "Fail to run pytorch-gpgpu-sim" to "no PTX implementation when GPGPU-sim runs cuDNN" on Apr 14, 2019
@bigwater (Author)

> Hi,
> I have not encountered this issue. In my case, loading ptxinfo and extracting the PTX files went well.
>
> Did you set CUDA_INSTALL_PATH? GPGPU-Sim may call the ptxas command based on the value.

Thank you. I have fixed the ptxas problem (although I do not know how it got fixed just by switching to another computer).

Have you tried the MNIST example in https://github.com/gpgpu-sim/gpgpu-sim_simulations? It seems the developers use this app for testing... But in the end, I still got the "no PTX implementation" error.

bigwater changed the title from "no PTX implementation when GPGPU-sim runs cuDNN" to "GPGPU-sim with cuDNN" on Apr 14, 2019
@RedCarrottt (Contributor) commented Apr 15, 2019

As you commented, I tried gpgpu-sim/gpgpu-sim_simulations.
When I run mnistCUDNN, PTX extraction and some implicit_convolve_sgemm calls work well, and I did not see the "no PTX implementation" error.
However, I hit a deadlock and the simulator finally died with the following message:

GPGPU-Sim uArch: Shader 32 bind to kernel 1 '_ZN5cudnn6detail23implicit_convolve_sgemmIffLi128ELi5ELi5ELi3ELi3ELi3ELi1ELb1ELb0ELb1EEEviiiPKT_iPT0_PS2_18kernel_conv_paramsiffiS6_S6_ii'
  <CTA alloc> : sm_idx=32 sid=32 max_cta_per_sm=16
GPGPU-Sim uArch: Shader 34 bind to kernel 1 '_ZN5cudnn6detail23implicit_convolve_sgemmIffLi128ELi5ELi5ELi3ELi3ELi3ELi1ELb1ELb0ELb1EEEviiiPKT_iPT0_PS2_18kernel_conv_paramsiffiS6_S6_ii'
  <CTA alloc> : sm_idx=34 sid=34 max_cta_per_sm=16
GPGPU-Sim uArch: ERROR ** deadlock detected: last writeback core 34 @ gpu_sim_cycle 12492 (+ gpu_tot_sim_cycle 4294867296) (87508 cycles ago)
GPGPU-Sim uArch: DEADLOCK  shader cores no longer committing instructions [core(# threads)]:
GPGPU-Sim uArch: DEADLOCK  0(64) 1(0) 2(64) 3(0) 4(64) 5(0) 6(64) 7(0)  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...
Re-run the simulator in gdb and use debug routines in .gdbinit to debug this
Aborted (core dumped)

@bigwater (Author)

> As you commented, I tried gpgpu-sim/gpgpu-sim_simulations.
> When I run mnistCUDNN, PTX extraction and some implicit_convolve_sgemm calls work well, and I did not see the "no PTX implementation" error.
> However, I hit a deadlock and the simulator finally died with the following message:
>
> GPGPU-Sim uArch: Shader 32 bind to kernel 1 '_ZN5cudnn6detail23implicit_convolve_sgemmIffLi128ELi5ELi5ELi3ELi3ELi3ELi1ELb1ELb0ELb1EEEviiiPKT_iPT0_PS2_18kernel_conv_paramsiffiS6_S6_ii'
>   <CTA alloc> : sm_idx=32 sid=32 max_cta_per_sm=16
> GPGPU-Sim uArch: Shader 34 bind to kernel 1 '_ZN5cudnn6detail23implicit_convolve_sgemmIffLi128ELi5ELi5ELi3ELi3ELi3ELi1ELb1ELb0ELb1EEEviiiPKT_iPT0_PS2_18kernel_conv_paramsiffiS6_S6_ii'
>   <CTA alloc> : sm_idx=34 sid=34 max_cta_per_sm=16
> GPGPU-Sim uArch: ERROR ** deadlock detected: last writeback core 34 @ gpu_sim_cycle 12492 (+ gpu_tot_sim_cycle 4294867296) (87508 cycles ago)
> GPGPU-Sim uArch: DEADLOCK  shader cores no longer committing instructions [core(# threads)]:
> GPGPU-Sim uArch: DEADLOCK  0(64) 1(0) 2(64) 3(0) 4(64) 5(0) 6(64) 7(0)  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...  + others ...
> Re-run the simulator in gdb and use debug routines in .gdbinit to debug this
> Aborted (core dumped)

Which configuration files are you using? When I used TITANX (Maxwell), I also got a deadlock. Could you try the TITANV config instead?
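For reference, I switch configs simply by copying the files from the simulator's configs folder into the run directory before launching; a rough sketch (the exact TITANV folder name under configs is an assumption, check your checkout):

import os
import shutil

# GPGPU-Sim reads gpgpusim.config (and the interconnect config it references)
# from the current working directory of the simulated application.
GPGPUSIM_ROOT = os.environ["GPGPUSIM_ROOT"]
titanv_cfg = os.path.join(GPGPUSIM_ROOT, "configs", "TITANV")  # assumed folder name

for name in os.listdir(titanv_cfg):
    shutil.copy(os.path.join(titanv_cfg, name), os.getcwd())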

It seems that in your log you also got "Warning: cannot find deviceFun" --- I guess the problem is that gpgpu-sim failed to find some device functions (maybe they do not belong to cuDNN? I will check later).

Thank you.

@RedCarrottt (Contributor)

Hello, I've found that mnistCUDNN requires both libcudnn and libcublas, but GPGPU-Sim supports extracting PTX files from only one library (whose path is defined by the PYTORCH_BIN variable).

So I modified GPGPU-Sim so that it supports extracting PTX files from multiple libraries.

I uploaded my pull request (PR #116).

Although mnistCUDNN calls several kernels built for the Fermi or Maxwell architectures (e.g. cudnn7maxwell4gemm) while using the Volta architecture config file, it works well anyway.

@cng123 (Contributor) commented Apr 18, 2019

Was mnistCUDNN linked statically with libcudnn and libcublas? I believe that was a requirement to account for gpgpu-sim only being able to extract PTX files from one file. The Makefile in deval281shah's mnistCUDNN example should be passing flags to statically link those libraries.
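A quick way to verify that is to check whether the built binary still has runtime dependencies on those libraries; a small sketch using ldd (the binary path is just whatever your build produced):

import subprocess

BINARY = "./mnistCUDNN"  # path to the built sample

# If libcudnn/libcublas were linked statically, they should not show up
# among the dynamic dependencies reported by ldd.
deps = subprocess.run(["ldd", BINARY], capture_output=True, text=True).stdout
for line in deps.splitlines():
    if "cudnn" in line or "cublas" in line:
        print("still dynamically linked:", line.strip())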

@masa-laboratory

@RedCarrottt @bigwater
Hello,

When I test minist/main.py, the output "GPGPU-Sim PTX: Parsing libcaffe2_gpu.1.sm_61.ptx" has lasted a long time and has not stopped yet. Have you ever encountered this problem when running this? Is it an error, or does handling the PTX itself just take that long?
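In case it helps to judge whether such a long parse is plausible, this is roughly how I check how big the extracted PTX is; the file name is taken from the log above, and I am assuming it gets dumped into the run directory.

import os

PTX = "libcaffe2_gpu.1.sm_61.ptx"  # name taken from the simulator log

size_mb = os.path.getsize(PTX) / (1024 * 1024)
with open(PTX) as f:
    entries = sum(1 for line in f if ".entry" in line)  # rough count of kernel entry points

# A file with many thousands of kernels can keep the PTX parser busy for a long time.
print(f"{PTX}: {size_mb:.1f} MiB, ~{entries} kernel entry points")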

Thanks.

@jidle123 commented Aug 6, 2024

> Hello,
>
> When I test minist/main.py, the output "GPGPU-Sim PTX: Parsing libcaffe2_gpu.1.sm_61.ptx" has lasted a long time and has not stopped yet. Have you ever encountered this problem when running this? Is it an error, or does handling the PTX itself just take that long?
>
> Thanks.

Hello, has this problem been solved? I ran into the same issue as you: it keeps printing libcaffe2_gpu.xx.sm_61.ptx and never stops.

@jidle123 commented Aug 6, 2024

> @RedCarrottt @bigwater
> Hello,
>
> When I test minist/main.py, the output "GPGPU-Sim PTX: Parsing libcaffe2_gpu.1.sm_61.ptx" has lasted a long time and has not stopped yet. Have you ever encountered this problem when running this? Is it an error, or does handling the PTX itself just take that long?
>
> Thanks.

Hi, have you dealt with it yet?

@itsMaoMao

> Hello,
> When I test minist/main.py, the output "GPGPU-Sim PTX: Parsing libcaffe2_gpu.1.sm_61.ptx" has lasted a long time and has not stopped yet. Have you ever encountered this problem when running this? Is it an error, or does handling the PTX itself just take that long?
> Thanks.
>
> Hello, has this problem been solved? I ran into the same issue as you: it keeps printing libcaffe2_gpu.xx.sm_61.ptx and never stops.

The PTX file generation will finish if you wait a while (for me it took between 10 and 20 minutes). Although I ran into some problems in my later runs, a correct PTX file for the code was generated within that time. Btw, did you manage to run it successfully?

@jidle123 commented Oct 31, 2024 via email
