[NVPTX] Do not run the NVVMReflect pass as part of the normal pipeline #121834

Merged
merged 3 commits into from
Jan 7, 2025

jhuber6
Contributor

@jhuber6 jhuber6 commented Jan 6, 2025

Summary:
This pass lowers the __nvvm_reflect builtin in the IR. However, this
currently runs in the standard optimization pipeline, not just the
backend pipeline. This means that if the user creates LLVM-IR without an
architecture set, it will always delete the reflect code even if it is
intended to be used later.

Pushing this into the backend pipeline will ensure that this works as
intended, allowing users to conditionally include code depending on
which target architecture the user ended up using. This fixes a bug in
OpenMP and missing code in libc.

@llvmbot
Member

llvmbot commented Jan 6, 2025

@llvm/pr-subscribers-backend-nvptx

Author: Joseph Huber (jhuber6)

Full diff: https://github.com/llvm/llvm-project/pull/121834.diff

6 Files Affected:

  • (modified) llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp (-1)
  • (modified) llvm/lib/Target/NVPTX/NVVMReflect.cpp (+7-1)
  • (modified) llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll (+2-2)
  • (modified) llvm/test/CodeGen/NVPTX/nvvm-reflect-ocl.ll (+2-2)
  • (modified) llvm/test/CodeGen/NVPTX/nvvm-reflect-opaque.ll (+3-3)
  • (modified) llvm/test/CodeGen/NVPTX/nvvm-reflect.ll (+4-3)
diff --git a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
index b3b2880588cc59..f6ec780d963d9a 100644
--- a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
@@ -255,7 +255,6 @@ void NVPTXTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
   PB.registerPipelineStartEPCallback(
       [this](ModulePassManager &PM, OptimizationLevel Level) {
         FunctionPassManager FPM;
-        FPM.addPass(NVVMReflectPass(Subtarget.getSmVersion()));
         // Note: NVVMIntrRangePass was causing numerical discrepancies at one
         // point, if issues crop up, consider disabling.
         FPM.addPass(NVVMIntrRangePass());
diff --git a/llvm/lib/Target/NVPTX/NVVMReflect.cpp b/llvm/lib/Target/NVPTX/NVVMReflect.cpp
index 56525a1edc7614..a0e897584a9d32 100644
--- a/llvm/lib/Target/NVPTX/NVVMReflect.cpp
+++ b/llvm/lib/Target/NVPTX/NVVMReflect.cpp
@@ -21,6 +21,7 @@
 #include "NVPTX.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/Analysis/ConstantFolding.h"
+#include "llvm/CodeGen/CommandFlags.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Function.h"
@@ -219,7 +220,12 @@ bool NVVMReflect::runOnFunction(Function &F) {
   return runNVVMReflect(F, SmVersion);
 }
 
-NVVMReflectPass::NVVMReflectPass() : NVVMReflectPass(0) {}
+NVVMReflectPass::NVVMReflectPass() {
+  // Get the CPU string from the command line if not provided.
+  StringRef SM = codegen::getMCPU();
+  if (!SM.consume_front("sm_") || SM.consumeInteger(10, SmVersion))
+    SmVersion = 0;
+}
 
 PreservedAnalyses NVVMReflectPass::run(Function &F,
                                        FunctionAnalysisManager &AM) {
diff --git a/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll b/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll
index ac5875c6ab1043..83cb3cde48de18 100644
--- a/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll
+++ b/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll
@@ -1,9 +1,9 @@
 ; Libdevice in recent CUDA versions relies on __CUDA_ARCH reflecting GPU type.
 ; Verify that __nvvm_reflect() is replaced with an appropriate value.
 ;
-; RUN: opt %s -S -passes='default<O2>' -mtriple=nvptx64 -mcpu=sm_20 \
+; RUN: opt %s -S -passes='nvvm-reflect' -mtriple=nvptx64 -mcpu=sm_20 \
 ; RUN:   | FileCheck %s --check-prefixes=COMMON,SM20
-; RUN: opt %s -S -passes='default<O2>' -mtriple=nvptx64 -mcpu=sm_35 \
+; RUN: opt %s -S -passes='nvvm-reflect' -mtriple=nvptx64 -mcpu=sm_35 \
 ; RUN:   | FileCheck %s --check-prefixes=COMMON,SM35
 
 @"$str" = private addrspace(1) constant [12 x i8] c"__CUDA_ARCH\00"
diff --git a/llvm/test/CodeGen/NVPTX/nvvm-reflect-ocl.ll b/llvm/test/CodeGen/NVPTX/nvvm-reflect-ocl.ll
index 9d383218dce86a..bf8d6e2cca3071 100644
--- a/llvm/test/CodeGen/NVPTX/nvvm-reflect-ocl.ll
+++ b/llvm/test/CodeGen/NVPTX/nvvm-reflect-ocl.ll
@@ -1,8 +1,8 @@
 ; Verify that __nvvm_reflect_ocl() is replaced with an appropriate value
 ;
-; RUN: opt %s -S -passes='default<O2>' -mtriple=nvptx64 -mcpu=sm_20 \
+; RUN: opt %s -S -passes='nvvm-reflect' -mtriple=nvptx64 -mcpu=sm_20 \
 ; RUN:   | FileCheck %s --check-prefixes=COMMON,SM20
-; RUN: opt %s -S -passes='default<O2>' -mtriple=nvptx64 -mcpu=sm_35 \
+; RUN: opt %s -S -passes='nvvm-reflect' -mtriple=nvptx64 -mcpu=sm_35 \
 ; RUN:   | FileCheck %s --check-prefixes=COMMON,SM35
 
 @"$str" = private addrspace(4) constant [12 x i8] c"__CUDA_ARCH\00"
diff --git a/llvm/test/CodeGen/NVPTX/nvvm-reflect-opaque.ll b/llvm/test/CodeGen/NVPTX/nvvm-reflect-opaque.ll
index 46ab79d9858cad..19c74df3037028 100644
--- a/llvm/test/CodeGen/NVPTX/nvvm-reflect-opaque.ll
+++ b/llvm/test/CodeGen/NVPTX/nvvm-reflect-opaque.ll
@@ -3,12 +3,12 @@
 
 ; RUN: cat %s > %t.noftz
 ; RUN: echo '!0 = !{i32 4, !"nvvm-reflect-ftz", i32 0}' >> %t.noftz
-; RUN: opt %t.noftz -S -mtriple=nvptx-nvidia-cuda -passes='default<O2>' \
+; RUN: opt %t.noftz -S -mtriple=nvptx-nvidia-cuda -passes='nvvm-reflect,simplifycfg' \
 ; RUN:   | FileCheck %s --check-prefix=USE_FTZ_0 --check-prefix=CHECK
 
 ; RUN: cat %s > %t.ftz
 ; RUN: echo '!0 = !{i32 4, !"nvvm-reflect-ftz", i32 1}' >> %t.ftz
-; RUN: opt %t.ftz -S -mtriple=nvptx-nvidia-cuda -passes='default<O2>' \
+; RUN: opt %t.ftz -S -mtriple=nvptx-nvidia-cuda -passes='nvvm-reflect,simplifycfg' \
 ; RUN:   | FileCheck %s --check-prefix=USE_FTZ_1 --check-prefix=CHECK
 
 @str = private unnamed_addr addrspace(4) constant [11 x i8] c"__CUDA_FTZ\00"
@@ -43,7 +43,7 @@ exit:
 
 declare i32 @llvm.nvvm.reflect(ptr)
 
-; CHECK-LABEL: define noundef i32 @intrinsic
+; CHECK-LABEL: define i32 @intrinsic
 define i32 @intrinsic() {
 ; CHECK-NOT: call i32 @llvm.nvvm.reflect
 ; USE_FTZ_0: ret i32 0
diff --git a/llvm/test/CodeGen/NVPTX/nvvm-reflect.ll b/llvm/test/CodeGen/NVPTX/nvvm-reflect.ll
index 2ed9f7c11bcf9b..244b44fea9b83c 100644
--- a/llvm/test/CodeGen/NVPTX/nvvm-reflect.ll
+++ b/llvm/test/CodeGen/NVPTX/nvvm-reflect.ll
@@ -3,12 +3,12 @@
 
 ; RUN: cat %s > %t.noftz
 ; RUN: echo '!0 = !{i32 4, !"nvvm-reflect-ftz", i32 0}' >> %t.noftz
-; RUN: opt %t.noftz -S -mtriple=nvptx-nvidia-cuda -passes='default<O2>' \
+; RUN: opt %t.noftz -S -mtriple=nvptx-nvidia-cuda -passes='nvvm-reflect,simplifycfg' \
 ; RUN:   | FileCheck %s --check-prefix=USE_FTZ_0 --check-prefix=CHECK
 
 ; RUN: cat %s > %t.ftz
 ; RUN: echo '!0 = !{i32 4, !"nvvm-reflect-ftz", i32 1}' >> %t.ftz
-; RUN: opt %t.ftz -S -mtriple=nvptx-nvidia-cuda -passes='default<O2>' \
+; RUN: opt %t.ftz -S -mtriple=nvptx-nvidia-cuda -passes='nvvm-reflect,simplifycfg' \
 ; RUN:   | FileCheck %s --check-prefix=USE_FTZ_1 --check-prefix=CHECK
 
 @str = private unnamed_addr addrspace(4) constant [11 x i8] c"__CUDA_FTZ\00"
@@ -43,7 +43,8 @@ exit:
 
 declare i32 @llvm.nvvm.reflect(ptr)
 
-; CHECK-LABEL: define noundef i32 @intrinsic
+; CHECK-LABEL: define i32 @intrinsic
+
 define i32 @intrinsic() {
 ; CHECK-NOT: call i32 @llvm.nvvm.reflect
 ; USE_FTZ_0: ret i32 0

@Artem-B
Member

Artem-B commented Jan 6, 2025

The problem is that libdevice depends on this patch and it does carry a fair amount of code that will no longer benefit from removal of unused conditional branches.
The way libdevice is used in CUDA, the intent was to process conditional bitcode early.
If OpenMP wants to do it differently, I would prefer to make it a special case, and keep the early reflect pass for CUDA.

@jhuber6
Contributor Author

jhuber6 commented Jan 6, 2025

> The problem is that libdevice depends on this patch and it does carry a fair amount of code that will no longer benefit from removal of unused conditional branches. The way libdevice is used in CUDA, the intent was to process conditional bitcode early. If OpenMP wants to do it differently, I would prefer to make it a special case, and keep the early reflect pass for CUDA.

I don't think this will make a considerable difference, since it's usually guarding some very shallow code paths. We still get full optimizations when the backend runs. If you think this is a major issue, I could acquiesce to making the non-backend version skip lowering if SmVersion is not set, but I think that this is cleaner.

$ clang foo.c --target=nvptx64-nvidia-cuda -flto -c -O2  # Used to run here
$ clang foo.bc --target=nvptx64-nvidia-cuda -O2          # Now only runs here

@jtramm

jtramm commented Jan 6, 2025

With this PR, OpenMC works again with NVIDIA cards (OpenMC has been broken on NVIDIA since #119091).

@jdoerfert
Member

FWIW, the pass should be super cheap, if it starts by looking for the intrinsic and then the uses. Running it twice is a valid option.

@jhuber6
Contributor Author

jhuber6 commented Jan 6, 2025

> FWIW, the pass should be super cheap, if it starts by looking for the intrinsic and then the uses. Running it twice is a valid option.

First pass should delete all the uses, the concern is that by not trimming the intrinsic earlier we're losing some optimizations, but I feel like we're still getting a full optimization pipeline and this likely will be easily optimized out. If @Artem-B is really concerned I'll just change it to keep the per-file run but ignore it if there's no SM passed.

@Artem-B
Member

Artem-B commented Jan 6, 2025

> Running it twice is a valid option.

The problem, IIUC, is that in some compilation modes we may run optimization w/o the constants set properly for the reflect pass and running it may pick the wrong branch -- something that the late reflect pass would not be able to undo.

> I don't think this will make a considerable difference, since it's usually guarding some very shallow code paths.

I don't think it's always the case. There are functions in libdevice where __nvvm_reflect() is used multiple times (i.e. not just a single top-level if).

> I feel like we're still getting a full optimization pipeline and this likely will be easily optimized out.

It's a maybe. Considering that __nvvm_reflect() only depends on a string, its branches may be optimizable, as long as they have no other __nvvm_reflect() calls in them. However, libdevice does have some functions where it's not the case.

The practical impact will likely be limited to the heavy functions with multiple __nvvm_reflect() calls in branches, which will potentially keep those branches hanging around and blocking optimizations throughout the normal optimization pipeline. Whether the limited optimizations in the back-end will be sufficient to produce good-enough code for those functions -- I do not know. It will likely be rare, but it's a small consolation for those folks who use those functions.

> ignore it if there's no SM passed.

Reflect can be used with other parameters, so SM-only check alone is, generally speaking, not sufficient as an on/off switch.

I think the decision where the reflect pass should run should be tied to the earliest point where the reflect inputs get set. For CUDA, it's the beginning of the pipeline. For OpenMP and stand-alone compilation it's probably somewhere closer to the back-end (or wherever we may link in with libdevice, or do LTO, or other point where we finally know what we're actually compiling for).

@jhuber6
Contributor Author

jhuber6 commented Jan 7, 2025

> I think the decision where the reflect pass should run should be tied to the earliest point where the reflect inputs get set. For CUDA, it's the beginning of the pipeline. For OpenMP and stand-alone compilation it's probably somewhere closer to the back-end (or wherever we may link in with libdevice, or do LTO, or other point where we finally know what we're actually compiling for).

Yeah, the point is to defer something until the backend knows what the actual target is. The optimizations that run on the initial compile are usually more generic so I wouldn't think this would fire until the backend.

@jdoerfert
Member

> > ignore it if there's no SM passed.
>
> Reflect can be used with other parameters, so SM-only check alone is, generally speaking, not sufficient as an on/off switch.

Can we make the pass at first only specialize what is known to be known, and later do the rest?

@jhuber6
Contributor Author

jhuber6 commented Jan 7, 2025

> Can we make the pass at first only specialize what is known to be known, and later do the rest?

What I was suggesting, but I think it makes more sense to just make this a backend thing. Only difference it makes is having a few branches live slightly longer, but I really don't think that it'll make a noticeable difference.

@arsenm
Contributor

arsenm commented Jan 7, 2025

I don't think this belongs in the backend, or middle end optimization pipeline. It's really a job for whatever "frontend" is loading the bitcode for final code generation

@jdoerfert
Member

> I don't think this belongs in the backend, or middle end optimization pipeline. It's really a job for whatever "frontend" is loading the bitcode for final code generation

I get @jhuber6's point about target-specific specialization. There is a benefit if we could do more "library" code IR generation w/o specifying all target details. We kinda do that now, and it broke stuff, but the direction is good.

What is the downside of multiple specialization runs, with the earlier one(s) not specializing what they do not know for sure?

@jhuber6
Contributor Author

jhuber6 commented Jan 7, 2025

I think I could probably make @Artem-B happy if I just forewent the early pass if the architecture is not known. That'd leave CUDA with identical behavior while allowing this kind of use where we only specify the target during the final link + backend stage.

@Artem-B
Member

Artem-B commented Jan 7, 2025

That would work, too.

@jhuber6
Contributor Author

jhuber6 commented Jan 7, 2025

> That would work, too.

It's a little annoying for NVPTX because we just default to sm_30 in these cases, might need to invent some way to detect if it's not been passed.

@Artem-B
Member

Artem-B commented Jan 7, 2025

Can you rely on the 'cuda' part of the triple instead?

@jhuber6
Contributor Author

jhuber6 commented Jan 7, 2025

Here's my attempt, hopefully it's not too invasive. Eager to get this landed so OpenMC works again.

Member

@Artem-B Artem-B left a comment

LGTM


-  ParseSubtargetFeatures(TargetName, /*TuneCPU*/ TargetName, FS);
+  ParseSubtargetFeatures(CPU.empty() ? "sm_30" : CPU,
Member

CPU.empty() ? "sm_30" : CPU -> getTargetName()

@@ -35,9 +35,10 @@ void NVPTXSubtarget::anchor() {}
NVPTXSubtarget &NVPTXSubtarget::initializeSubtargetDependencies(StringRef CPU,
StringRef FS) {
// Provide the default CPU if we don't have one.
-  TargetName = std::string(CPU.empty() ? "sm_30" : CPU);
+  TargetName = std::string(CPU);
Member

It could use a comment on why we may want to keep CPU empty in some cases.

github-actions bot commented Jan 7, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@jhuber6 jhuber6 merged commit 29b5c18 into llvm:main Jan 7, 2025
5 of 7 checks passed
@kazutakahirata
Contributor

I just checked in e7a83fc to fix a warning from this PR.

@jhuber6
Contributor Author

jhuber6 commented Jan 7, 2025

> I just checked in e7a83fc to fix a warning from this PR.

Was in the process of doing that myself, thanks for fixing it so fast.

@llvm-ci
Collaborator

llvm-ci commented Jan 7, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-aarch64-linux-bootstrap-hwasan running on sanitizer-buildbot11 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/55/builds/5192

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 85760 tests, 72 workers --
Testing:  0.. 10.. 20.. 30.. 40
FAIL: LLVM :: CodeGen/NVPTX/nvvm-reflect-arch.ll (38296 of 85760)
******************** TEST 'LLVM :: CodeGen/NVPTX/nvvm-reflect-arch.ll' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
RUN: at line 4: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/opt /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll -S -passes='nvvm-reflect' -mtriple=nvptx64 -mcpu=sm_20    | /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll --check-prefixes=COMMON,SM20
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/opt /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll -S -passes=nvvm-reflect -mtriple=nvptx64 -mcpu=sm_20
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll --check-prefixes=COMMON,SM20
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/opt /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll -S -passes=nvvm-reflect -mtriple=nvptx64 -mcpu=sm_20
 #0 0x0000b5fd5914ec6c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x0000b5fd59149440 llvm::sys::RunSignalHandlers() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x0000b5fd59150280 SignalHandler(int) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
 #3 0x0000f2d1577c98f8 (linux-vdso.so.1+0x8f8)
 #4 0x0000b5fd5906af80 SigTrap<(__hwasan::ErrorAction)1, (__hwasan::AccessType)0> /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_checks.h:107:3
 #5 0x0000b5fd5906af80 MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_common_interceptors.inc:847:7
 #6 0x0000b5fd600b40f8 consume_front /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/StringRef.h:636:11
 #7 0x0000b5fd600b40f8 llvm::NVVMReflectPass::NVVMReflectPass() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Target/NVPTX/NVVMReflect.cpp:226:11
 #8 0x0000b5fd60099478 addPass<llvm::NVVMReflectPass> /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/IR/PassManager.h:201:9
 #9 0x0000b5fd60099478 operator() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Target/NVPTX/NVPTXPassRegistry.def:39:1
#10 0x0000b5fd60099478 __invoke<(lambda at /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Passes/TargetPassRegistry.inc:112:36) &, llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> > &, llvm::ArrayRef<llvm::PassBuilder::PipelineElement> > /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__type_traits/invoke.h:149:25
#11 0x0000b5fd60099478 __call<(lambda at /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Passes/TargetPassRegistry.inc:112:36) &, llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> > &, llvm::ArrayRef<llvm::PassBuilder::PipelineElement> > /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__type_traits/invoke.h:216:12
#12 0x0000b5fd60099478 operator() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__functional/function.h:169:12
#13 0x0000b5fd60099478 std::__1::__function::__func<llvm::NVPTXTargetMachine::registerPassBuilderCallbacks(llvm::PassBuilder&)::$_3, std::__1::allocator<llvm::NVPTXTargetMachine::registerPassBuilderCallbacks(llvm::PassBuilder&)::$_3>, bool (llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>)>::operator()(llvm::StringRef&&, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>&&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__functional/function.h:314:10
#14 0x0000b5fd5dca5b40 operator() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__functional/function.h:990:3
#15 0x0000b5fd5dca5b40 bool callbacksAcceptPassName<llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::SmallVector<std::__1::function<bool (llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>)>, 2u>>(llvm::StringRef, llvm::SmallVector<std::__1::function<bool (llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>)>, 2u>&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Passes/PassBuilder.cpp:1338:11
#16 0x0000b5fd5db13390 llvm::PassBuilder::parsePassPipeline(llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>&, llvm::StringRef) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Passes/PassBuilder.cpp:2173:16
#17 0x0000b5fd5c65dc98 getPtr /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Support/Error.h:279:42
#18 0x0000b5fd5c65dc98 operator bool /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Support/Error.h:242:16
#19 0x0000b5fd5c65dc98 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::__1::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/tools/opt/NewPMDriver.cpp:478:14
#20 0x0000b5fd590b6830 __is_long /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/string:1892:23
#21 0x0000b5fd590b6830 ~basic_string /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/string:1228:9
#22 0x0000b5fd590b6830 optMain /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/tools/opt/optdriver.cpp:747:3
#23 0x0000f2d1570684c4 (/lib/aarch64-linux-gnu/libc.so.6+0x284c4)
#24 0x0000f2d157068598 __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x28598)
#25 0x0000b5fd5905ae70 _start (/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/opt+0x57fae70)
==289873==ERROR: HWAddressSanitizer: tag-mismatch on address 0xffffd31298a1 at pc 0xb5fd5906af80
Step 11 (stage2/hwasan check) failure: stage2/hwasan check (failure)
#14 0x0000b5fd5dca5b40 operator() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__functional/function.h:990:3
#15 0x0000b5fd5dca5b40 bool callbacksAcceptPassName<llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::SmallVector<std::__1::function<bool (llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>)>, 2u>>(llvm::StringRef, llvm::SmallVector<std::__1::function<bool (llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>)>, 2u>&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Passes/PassBuilder.cpp:1338:11
#16 0x0000b5fd5db13390 llvm::PassBuilder::parsePassPipeline(llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>&, llvm::StringRef) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Passes/PassBuilder.cpp:2173:16
#17 0x0000b5fd5c65dc98 getPtr /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Support/Error.h:279:42
#18 0x0000b5fd5c65dc98 operator bool /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Support/Error.h:242:16
#19 0x0000b5fd5c65dc98 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::__1::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/tools/opt/NewPMDriver.cpp:478:14
#20 0x0000b5fd590b6830 __is_long /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/string:1892:23
#21 0x0000b5fd590b6830 ~basic_string /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/string:1228:9
#22 0x0000b5fd590b6830 optMain /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/tools/opt/optdriver.cpp:747:3
#23 0x0000f2d1570684c4 (/lib/aarch64-linux-gnu/libc.so.6+0x284c4)
#24 0x0000f2d157068598 __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x28598)
#25 0x0000b5fd5905ae70 _start (/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/opt+0x57fae70)
==289873==ERROR: HWAddressSanitizer: tag-mismatch on address 0xffffd31298a1 at pc 0xb5fd5906af80
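
The report above ends inside `consume_front`, called from the `NVVMReflectPass` constructor. One common way to get a tag-mismatch of this shape is a non-owning string view that outlives the temporary string it was built from. Whether that is the cause here is an inference from the frame names, not something the log states. A minimal sketch of the safe pattern in plain C++, where `getCpuName` and `hasSmPrefix` are hypothetical stand-ins and not LLVM functions:

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for an accessor that returns its result by value.
std::string getCpuName() { return "sm_20"; }

// Risky shape, shown only as a comment because it is undefined behaviour:
//   std::string_view cpu = getCpuName();  // the temporary dies at the semicolon
//   cpu.substr(0, 3) == "sm_";            // reads storage that is already gone

// Safer shape: keep the owning std::string alive for as long as any view is used.
bool hasSmPrefix() {
  const std::string owned = getCpuName();
  std::string_view cpu = owned;
  return cpu.substr(0, 3) == "sm_";
}
```

The owning string and the view share a scope, so the view never refers to destroyed storage.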

@krzysz00
Contributor

Hi

This PR causes problems when LLVM is used as a library, where the command-line flags might never be registered. We hit another variant of the ASan failure above in https://github.com/iree-org/iree/actions/runs/12759006193/job/35562065903?pr=19683 as shown below.

From what I can tell, getMCPU() isn't meant to be used to get the CPU string outside of a command-line tool, and this should be looked up from the target triple instead.

Would you be willing to fix this?
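
A minimal sketch of the direction suggested here, in plain C++ with no LLVM dependency: the CPU name is handed in by whoever constructs the object, so nothing reads global command-line state. `parseSmVersion` and `ReflectSketch` are hypothetical names for illustration and are not part of LLVM. Where a real tool obtains the string is left to the caller.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: extract the number from a CPU name such as "sm_20".
// Returns std::nullopt when the name does not start with "sm_<digits>".
// Anything after the digits (for example the "a" in "sm_90a") is ignored.
std::optional<unsigned> parseSmVersion(std::string_view cpu) {
  constexpr std::string_view prefix = "sm_";
  if (cpu.substr(0, prefix.size()) != prefix)
    return std::nullopt;
  cpu.remove_prefix(prefix.size());

  unsigned version = 0;
  const char *first = cpu.data();
  const char *last = cpu.data() + cpu.size();
  auto [ptr, ec] = std::from_chars(first, last, version);
  if (ec != std::errc() || ptr == first)
    return std::nullopt;
  return version;
}

// Hypothetical stand-in for a pass object. The caller supplies the CPU name,
// so construction works even when no command-line flags were ever registered.
struct ReflectSketch {
  unsigned smVersion;
  explicit ReflectSketch(std::string_view cpu)
      : smVersion(parseSmVersion(cpu).value_or(0)) {}
};
```

Falling back to 0 for an unrecognised name is a choice made for this sketch. A library embedding might prefer to report the error to its caller instead.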

Failure log
 cd /__w/iree/iree/build-asan/runtime/src/iree/hal/drivers/cuda/cts && /__w/iree/iree/build-asan/tools/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --compile-mode=hal-executable --iree-hal-target-backends=cuda /__w/iree/iree/runtime/src/iree/hal/cts/testdata/command_buffer_dispatch_constants_test.mlir -o cuda_command_buffer_dispatch_constants_test.bin --iree-hal-executable-object-search-path=\"/__w/iree/iree/build-asan\"
  iree-compile: /__w/iree/iree/third_party/llvm-project/llvm/lib/CodeGen/CommandFlags.cpp:58: std::string llvm::codegen::getMCPU(): Assertion `MCPUView && "RegisterCodeGenFlags not created."' failed.
  Please report issues to https://github.com/iree-org/iree/issues and include the crash backtrace.
  Stack dump:
  0.	Program arguments: /__w/iree/iree/build-asan/tools/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --compile-mode=hal-executable --iree-hal-target-backends=cuda /__w/iree/iree/runtime/src/iree/hal/cts/testdata/command_buffer_dispatch_constants_test.mlir -o cuda_command_buffer_dispatch_constants_test.bin --iree-hal-executable-object-search-path=\"/__w/iree/iree/build-asan\"
   #0 0x000056203ea51ce6 ___interceptor_backtrace (/__w/iree/iree/build-asan/tools/iree-compile+0x72ce6)
   #1 0x00007fe6cfe8a681 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /__w/iree/iree/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:800:13
   #2 0x00007fe6cfe85c08 llvm::sys::RunSignalHandlers() /__w/iree/iree/third_party/llvm-project/llvm/lib/Support/Signals.cpp:0:5
   #3 0x00007fe6cfe8b410 SignalHandler(int) /__w/iree/iree/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
   #4 0x00007fe6cb152520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
   #5 0x00007fe6cb1a69fc pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x969fc)
   #6 0x00007fe6cb152476 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42476)
   #7 0x00007fe6cb1387f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3)
   #8 0x00007fe6cb13871b (/lib/x86_64-linux-gnu/libc.so.6+0x2871b)
   #9 0x00007fe6cb149e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
  #10 0x00007fe6e03f975f std::char_traits<char>::assign(char&, char const&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/char_traits.h:357:14
  #11 0x00007fe6e03f975f std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_set_length(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:225:2
  #12 0x00007fe6e03f975f void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.tcc:232:2
  #13 0x00007fe6e03f975f void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct_aux<char*>(char*, char*, std::__false_type) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:255:11
  #14 0x00007fe6e03f975f void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:274:4
  #15 0x00007fe6e03f975f std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:459:9
  #16 0x00007fe6e03f975f llvm::codegen::getMCPU[abi:cxx11]() /__w/iree/iree/third_party/llvm-project/llvm/lib/CodeGen/CommandFlags.cpp:58:1
  #17 0x00007fe6dc860e84 llvm::NVVMReflectPass::NVVMReflectPass() /__w/iree/iree/third_party/llvm-project/llvm/lib/Target/NVPTX/NVVMReflect.cpp:226:3
  #18 0x00007fe6dc82c3c9 std::enable_if<!(std::is_same_v<llvm::NVVMReflectPass, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> > >), void>::type llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >::addPass<llvm::NVVMReflectPass>(llvm::NVVMReflectPass&&) /__w/iree/iree/third_party/llvm-project/llvm/include/llvm/IR/PassManager.h:201:9
  #19 0x00007fe6dc82c3c9 llvm::NVPTXTargetMachine::registerPassBuilderCallbacks(llvm::PassBuilder&)::$_2::operator()(llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>) const /__w/iree/iree/third_party/llvm-project/llvm/lib/Target/NVPTX/NVPTXPassRegistry.def:39:1
  #20 0x00007fe6dc82c3c9 bool std::__invoke_impl<bool, llvm::NVPTXTargetMachine::registerPassBuilderCallbacks(llvm::PassBuilder&)::$_2&, llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement> >(std::__invoke_other, llvm::NVPTXTargetMachine::registerPassBuilderCallbacks(llvm::PassBuilder&)::$_2&, llvm::StringRef&&, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>&&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
  #21 0x00007fe6dc82c3c9 std::enable_if<is_invocable_r_v<bool, llvm::NVPTXTargetMachine::registerPassBuilderCallbacks(llvm::PassBuilder&)::$_2&, llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement> >, bool>::type std::__invoke_r<bool, llvm::NVPTXTargetMachine::registerPassBuilderCallbacks(llvm::PassBuilder&)::$_2&, llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement> >(llvm::NVPTXTargetMachine::registerPassBuilderCallbacks(llvm::PassBuilder&)::$_2&, llvm::StringRef&&, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>&&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:114:9
  #22 0x00007fe6dc82c3c9 std::_Function_handler<bool (llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>), llvm::NVPTXTargetMachine::registerPassBuilderCallbacks(llvm::PassBuilder&)::$_2>::_M_invoke(std::_Any_data const&, llvm::StringRef&&, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>&&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:290:9
  #23 0x00007fe6e01f791f std::function<bool (llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>)>::operator()(llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>) const /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:590:9
  #24 0x00007fe6e01f791f bool callbacksAcceptPassName<llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >, llvm::SmallVector<std::function<bool (llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>)>, 2u> >(llvm::StringRef, llvm::SmallVector<std::function<bool (llvm::StringRef, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function> >&, llvm::ArrayRef<llvm::PassBuilder::PipelineElement>)>, 2u>&) /__w/iree/iree/third_party/llvm-project/llvm/lib/Passes/PassBuilder.cpp:1349:11
  #25 0x00007fe6e008da17 llvm::PassBuilder::parsePassPipeline(llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >&, llvm::StringRef) /__w/iree/iree/third_party/llvm-project/llvm/lib/Passes/PassBuilder.cpp:2184:16
  #26 0x00007fe6d35f69bd llvm::Error::getPtr() const /__w/iree/iree/third_party/llvm-project/llvm/include/llvm/Support/Error.h:282:12
  #27 0x00007fe6d35f69bd llvm::Error::operator bool() /__w/iree/iree/third_party/llvm-project/llvm/include/llvm/Support/Error.h:242:16
  #28 0x00007fe6d35f69bd mlir::iree_compiler::IREE::HAL::optimizeModule(llvm::Module&, llvm::TargetMachine&, std::array<int, 3ul> const&) /__w/iree/iree/compiler/plugins/target/CUDA/CUDATarget.cpp:359:7
  #29 0x00007fe6d35f69bd mlir::iree_compiler::IREE::HAL::CUDATargetBackend::serializeExecutable(mlir::iree_compiler::IREE::HAL::TargetBackend::SerializationOptions const&, mlir::iree_compiler::IREE::HAL::ExecutableVariantOp, mlir::OpBuilder&) /__w/iree/iree/compiler/plugins/target/CUDA/CUDATarget.cpp:613:7
  #30 0x00007fe6d4bf2334 mlir::iree_compiler::IREE::HAL::(anonymous namespace)::SerializeTargetExecutablesPass::runOnOperation() /__w/iree/iree/compiler/src/iree/compiler/Dialect/HAL/Transforms/SerializeExecutables.cpp:87:11
[...]

@jhuber6
Contributor Author

jhuber6 commented Jan 16, 2025

Oh, I can probably remove that. I thought it only showed up when called from opt directly. It was only necessary to get the tests to work, but honestly I could just ignore that.

krzysz00 added a commit to iree-org/llvm-project that referenced this pull request Jan 16, 2025
… pipeline (llvm#121834)"

This reverts commit 29b5c18.

Breaks ASan build

Signed-off-by: Krzysztof Drewniak <[email protected]>
nirvedhmeshram pushed a commit to iree-org/llvm-project that referenced this pull request Jan 20, 2025
… pipeline (llvm#121834)"

This reverts commit 29b5c18.

Breaks ASan build

Signed-off-by: Krzysztof Drewniak <[email protected]>