Enable LLVM NVPTX target only if ROOT's cmake option 'cuda' is enabled #20225
To reduce binary size and compile time, we add LLVM's NVPTX target only if ROOT's CMake option 'cuda' is enabled. See: root-project#20208
Add a null-object definition of IncrementalCUDADeviceCompiler so that client code of this class does not need to be wrapped in #ifdefs in order to compile.
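A minimal sketch of the null-object idea described above. The macro `HYPOTHETICAL_HAS_NVPTX`, the class `DeviceCompiler`, and the helper `tryCompile` are all assumptions made for illustration; they are not cling's actual IncrementalCUDADeviceCompiler interface. The only point is that both build configurations expose the same class, so callers compile without #ifdefs:

```cpp
#include <string>

// Hypothetical feature macro; in a real build it would be set by the
// build system. Defaults to 0 (backend disabled) for this sketch.
#ifndef HYPOTHETICAL_HAS_NVPTX
#define HYPOTHETICAL_HAS_NVPTX 0
#endif

#if HYPOTHETICAL_HAS_NVPTX
// Stand-in for the full implementation (real details omitted).
class DeviceCompiler {
public:
  bool isValid() const { return true; }
  bool compile(const std::string& code) { return !code.empty(); }
};
#else
// Null object: same interface, but every operation is a harmless no-op.
class DeviceCompiler {
public:
  bool isValid() const { return false; }
  bool compile(const std::string&) { return false; }
};
#endif

// Client code is identical in both configurations: no #ifdef needed here.
inline bool tryCompile(DeviceCompiler& dc, const std::string& code) {
  return dc.isValid() && dc.compile(code);
}
```

With the backend disabled, `tryCompile` simply reports failure, which leaves the caller free to emit a diagnostic at a single place.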
Test Results: 22 files, 22 suites, 3d 15h 1m 45s ⏱️ Results for commit 883eb20. ♻️ This comment has been updated with latest results.
Do I understand correctly that if the tests pass, we will merge this change?
And, btw, what is the improvement in size and compile time?
I didn't measure, but I would expect only a little: the backends are a very small part of the entire LLVM libraries; the generic infrastructure, optimization passes and JIT functionality are much bigger.
I've mentioned privately that there is little point in this change even if requested by an experiment. I do not oppose it, but I do not see a benefit either, except for adding some more conditionals to the codebase.
Including cc: @guitargeek
Why? It is included by a number of headers and installed as part of LLVM. |
I agree with Vassil. Before merging this we ought to measure the change in size and time as the stated motivation in #20208 is: If this PR does not accomplish this, it sounds like the extra complication is not worth it. TLDR: Please measure the gains :)
This PR reduces the size of ROOT's install directory for a release build by a mere Due to your comments, the complication is now quite small. So I would argue that it would be worth merging.
My problem with merging this is that we lose the ability to call cling -cuda for such builds. That is not a big deal if we have diagnostics for the case, but it makes the distribution less flexible for no real impact… I will let others decide though…
devajithvs left a comment
LGTM!
pcanal left a comment
I also propose to close this PR due to the lower than expected positive impact.
Maybe related: #20231
hahnjo left a comment
Given the small changes required and that #20231 proposes a less correct approach, I personally think we should merge this even if the gain is small. Maybe we should update the commit message with a different motivation (and squash the commits). For me, that would be the fact that CUDA in Cling is an optional feature that we know doesn't properly work from inside ROOT (with modules). I am less worried about standalone cling since there the user has to actively work to turn off backends.
return;
}

#if LLVM_HAS_NVPTX_TARGET == 1
Suggested change:
- #if LLVM_HAS_NVPTX_TARGET == 1
+ #if LLVM_HAS_NVPTX_TARGET
LLVMInitializeNVPTXAsmPrinter();

m_Init = true;
#endif // #if LLVM_HAS_NVPTX_TARGET == 1
Suggested change:
- #endif // #if LLVM_HAS_NVPTX_TARGET == 1
+ #endif // LLVM_HAS_NVPTX_TARGET
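As background to the two preprocessor suggestions: in a `#if` expression, an identifier that is not defined as a macro evaluates to 0, so `#if X` and `#if X == 1` agree whenever X is 1, 0, or undefined. A small sketch with hypothetical macro names (not the ones used in this PR):

```cpp
// Hypothetical flags, for illustration only.
#define FLAG_ON 1
#define FLAG_OFF 0
// FLAG_UNDEFINED is deliberately left undefined.

// Short form: "#if X"
#if FLAG_ON
constexpr bool kOnShort = true;
#else
constexpr bool kOnShort = false;
#endif

#if FLAG_OFF
constexpr bool kOffShort = true;
#else
constexpr bool kOffShort = false;
#endif

#if FLAG_UNDEFINED  // undefined identifier is treated as 0
constexpr bool kUndefShort = true;
#else
constexpr bool kUndefShort = false;
#endif

// Long form: "#if X == 1"
#if FLAG_ON == 1
constexpr bool kOnLong = true;
#else
constexpr bool kOnLong = false;
#endif

#if FLAG_OFF == 1
constexpr bool kOffLong = true;
#else
constexpr bool kOffLong = false;
#endif

#if FLAG_UNDEFINED == 1  // evaluates as 0 == 1
constexpr bool kUndefLong = true;
#else
constexpr bool kUndefLong = false;
#endif
```

The two forms only diverge if the macro is defined to a non-zero value other than 1, which is one reason the shorter spelling is usually enough for a 0/1 feature flag. Compilers may warn about the undefined identifier when a flag such as `-Wundef` is enabled.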
endif()

if(NOT "${ROOT_CLING_TARGET}" STREQUAL "all")
if(cuda)
I think we want to keep the previous condition:
Suggested change:
- if(cuda)
+ if(cuda AND NOT "${ROOT_CLING_TARGET}" STREQUAL "all")