First-order CUDA follow-up fix: do not use NVTX #162

Open
wants to merge 1 commit into
base: develop
from

Conversation

griwodz
Member

@griwodz griwodz commented Aug 12, 2024

Description

Remove the use of NVTX from the develop branch. NVTX is a profiling feature that most users do not need. People who know about it can add the relevant lines themselves.

NVTX is currently very troublesome: the transition from the NVTX library to header-only NVTX3 is complete on Windows, where the old NVTX has been removed, while NVTX3 does not even exist on Tegra, Jetson and possibly other Arm-based platforms. Removing it is acceptable because NVTX is only relevant for very fine-grained debugging of the interplay between CPU and GPU. Its timing features can also be achieved with CUDA events (cudaEvent_t).

@griwodz griwodz added type:bug in progress cuda issues related to cuda versions bugfix labels Aug 12, 2024
@griwodz griwodz self-assigned this Aug 12, 2024
@griwodz griwodz changed the title First-order CUDA follow-up fixes First-order CUDA follow-up fix: troublesome transition to nvtx3 Aug 12, 2024
@griwodz griwodz force-pushed the dev/cmake-lang-cuda branch 3 times, most recently from ac217ae to 5c37b81 Compare August 12, 2024 10:08
@griwodz griwodz added ready and removed in progress labels Aug 12, 2024
@griwodz
Member Author

griwodz commented Aug 12, 2024

#161 seems to confirm that this PR fixes the problem. #161 also proposes a few updates to the vcpkg portfile.

@griwodz griwodz changed the title First-order CUDA follow-up fix: troublesome transition to nvtx3 First-order CUDA follow-up fix: do not use NVXT Aug 14, 2024
@griwodz griwodz changed the title First-order CUDA follow-up fix: do not use NVXT First-order CUDA follow-up fix: do not use NVTX Aug 14, 2024
This causes trouble for continuous integration and is apparently not supported on all platforms.
Since it is a debugging feature, it is just as well to remove it from the mainstream tree.
@griwodz griwodz force-pushed the dev/cmake-lang-cuda branch from 7435c0f to abef1d4 Compare August 15, 2024 06:30
@griwodz
Member Author

griwodz commented Aug 15, 2024

CUDA can always provide the timing of CUDA kernels running on the GPU to tools that observe it, whether debuggers or performance analyzers. It is usually not able to record the timing and occupancy of threads on the CPU. NVTX is a mechanism that allows CPU code to add timestamps for its running threads to the same system.

That makes it possible to discover whether the CPU is working on something while the GPU is working on something else. It is not foolproof because NVTX knows nothing about the CPU's "occupancy" (whether it actually does something or is sleeping). But it can help to understand the communication between CPU and GPU better.

NVIDIA promises that it costs "nearly nothing", and I don't know how it actually compares to something like prof. It does create output for very nice performance analyzers.

But it is most certainly not necessary to have it in the develop branch for everybody.

So while the transition from NVTX (version 2) to NVTX3 is ongoing and leads to cross-platform problems, I'd prefer to remove it entirely from the develop branch. Probably nobody other than me uses it anyway.
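For readers who still want NVTX ranges in a local build, the following is a minimal sketch of how the "relevant lines" could be added back without affecting anyone else. It is not taken from this PR. The macro name `MYPROJECT_USE_NVTX`, the class `ScopedRange` and the function `cpu_side_work` are hypothetical and exist only for illustration. The sketch assumes the header-only NVTX3 header `nvtx3/nvToolsExt.h` is available, which, as noted above, is not the case on every platform. Without the macro, the range markers compile to nothing.

```cpp
// Optional NVTX range markers around CPU-side work.
// MYPROJECT_USE_NVTX is a hypothetical opt-in macro, not part of this project.
#if defined(MYPROJECT_USE_NVTX)
// Header-only NVTX3; assumed to be present only on platforms that ship it.
#include <nvtx3/nvToolsExt.h>
#endif

// RAII helper: opens a named range on construction and closes it on
// destruction, so the range always ends when the scope is left.
class ScopedRange {
public:
    explicit ScopedRange(const char* name) {
#if defined(MYPROJECT_USE_NVTX)
        nvtxRangePushA(name);  // start of the range on the calling CPU thread
#else
        (void)name;            // no-op build: nothing is recorded
#endif
    }

    ~ScopedRange() {
#if defined(MYPROJECT_USE_NVTX)
        nvtxRangePop();        // end of the innermost open range
#endif
    }

    ScopedRange(const ScopedRange&) = delete;
    ScopedRange& operator=(const ScopedRange&) = delete;
};

// Hypothetical CPU-side function whose duration should show up next to the
// GPU kernels in a profiler timeline.
int cpu_side_work(int n) {
    ScopedRange range("cpu_side_work");
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += i;
    }
    return sum;
}
```

Keeping the NVTX calls behind a single opt-in macro means a default build never references the NVTX headers, which is the property this PR is after. As the comment above points out, such a range only marks when the CPU code was between the push and the pop, not whether the thread was actually busy during that time.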

Labels
bugfix cuda issues related to cuda versions ready type:bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[bug] Cannot built in vcpkg - MSVS2022 + Cuda 12.6
1 participant