-
Notifications
You must be signed in to change notification settings - Fork 12.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DWARF DW_tag_variable
dropped for PHINode in DILexicalBlock
#118883
Comments
There seems to be a bug upstream that drops debuginfo for PHINodes if they use a scope that is not used otherwise, so workaround this by using the Subprogram scope instead.
@llvm/issue-subscribers-debuginfo Author: Cody Tapscott (topolarity)
```llvm
define i32 @square(i32, i32) local_unnamed_addr #0 !dbg !4 {
%3 = alloca i32, align 4
store i32 %0, i32* %3, align 4
call void @llvm.dbg.declare(metadata i32* %3, metadata !12, metadata !DIExpression()), !dbg !9
call void @llvm.dbg.value(metadata i32 %1, metadata !13, metadata !DIExpression()), !dbg !9
%4 = icmp eq i32 %1, 0
call void @llvm.dbg.value(metadata i1 %4, metadata !10, metadata !DIExpression()), !dbg !9
br i1 %4, label %arg1, label %arg2
arg1: arg2: ret: declare void @llvm.dbg.value(metadata, metadata, metadata) #1 attributes #1 = { nofree nosync nounwind readnone speculatable willreturn } !llvm.dbg.cu = !{!2} !0 = !{i32 7, !"Dwarf Version", i32 5} !2 = distinct !DICompileUnit(language: DW_LANG_C99, file: !3, producer: "test", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
The debug information for "result" is missing It seems to be related to the fact that the PHINode is the only user of the DILexicalBlock. It works if you either:
@@ -19,5 +19,5 @@
ret:
%8 = phi i32 [%6, %arg1], [%7, %arg2], !dbg !9
tail call void @<!-- -->llvm.dbg.value(metadata i32 %8, metadata !7, metadata !DIExpression()), !dbg !15
- ret i32 %8
+ ret i32 %8, !dbg !15
}
@@ -38,7 +38,7 @@
!4 = distinct !DISubprogram(name: "square", linkageName: "square", scope: !3, file: !3, line: 1, type: !5, scopeLine: 3, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !2, retainedNodes: !16)
!5 = !DISubroutineType(types: !6)
!6 = !{!8, !8, !8}
-!7 = !DILocalVariable(name: "result", scope: !14, file: !3, line: 14, type: !8)
+!7 = !DILocalVariable(name: "result", scope: !4, file: !3, line: 14, type: !8)
!8 = !DIBasicType(name: "i32", size: 32, encoding: DW_ATE_unsigned)
!9 = !DILocation(line: 2, column: 1, scope: !4)
!10 = !DILocalVariable(name: "branch", scope: !4, file: !3, line: 14, type: !11)
@@ -46,5 +46,5 @@
!12 = !DILocalVariable(name: "x", arg: 1, scope: !4, file: !3, line: 1, type: !8)
!13 = !DILocalVariable(name: "y", arg: 2, scope: !4, file: !3, line: 1, type: !8)
!14 = distinct !DILexicalBlock(scope: !4, file: !3, line: 100, column: 5)
-!15 = !DILocation(line: 100, column: 50, scope: !14)
+!15 = !DILocation(line: 100, column: 50, scope: !4)
!16 = !{} Identified on LLVM 18.1.7 |
Seems to also affect trunk: https://godbolt.org/z/jzfx18zj1 |
Thanks for the report -- I believe this is technically LLVM working by design, and the missing variable is a victim of optimisations. There are no instructions remaining that are in the lexical scope (the dbg.value is not a "real" instruction, I believe the PHIs DILocation of This is desirable because it means LLVM is able to avoid emitting debugging information for code portions that has been completely optimised out, which saves us file size. |
Hmm, even if you make the PHI's DILocation |
Also to clarify:
That's not true here. If you change the scope so that things are not dropped, you can build a test program and observe the variable in $ cat main.c
#include "stdint.h"
#include "stdio.h"
int32_t square(int32_t a, int32_t b);
int main( void ) {
int32_t result = square(2, 3);
fprintf(stderr, "Got: %d\n", result);
}
$ clang main.c a.o -o main
$ ./main
Got: 9
$ gdb -q main
Reading symbols from main...
(gdb) b square
Breakpoint 1 at 0x1190: file square.ll, line 3.
(gdb) r
Starting program: /home/topolarity/repos/julia/main
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, square (x=..., y=...) at square.ll:3
3 square.ll: No such file or directory.
(gdb) si 8
0x00005555555551b9 in square (x=..., y=...) at square.ll:2
2 in square.ll
(gdb) p result
$1 = 9 (the code generated is identical in both cases, the only difference is the debuginfo) |
* Use `llvm.dbg.value` instead of `llvm.dbg.declare` for SSA values Also insert the debug instrinsic just after the instruction that defines it. Technically these are not addresses in memory, so it's important to use `dbg.value` instead of `dbg.declare`. Without this change, almost none of the local SSA values are available in gdb (it might be just the arguments that are) but with it I can very often print the arguments of the instruction execution is halted on. * Workaround llvm/llvm-project#118883 for PHINodes There seems to be a bug upstream that drops debuginfo for PHINodes if they use a scope that is not used otherwise, so workaround this by using the Subprogram scope instead.
Indeed -- the PHI is not a real instruction, during codegen it's lowered to an implicit joining of values when control-flow merges. In some scenarios constants get materialised from PHIs into parent blocks, but that comes with other difficulties, and isn't the case in this reproducer.
Could you elaborate on changing the scope -- if you modify the LLVM-IR in your reproducer then it's technically a different program, perhaps one where keeping the variable present makes sense, but with the given reproducer LLVM should drop the variable as there's no use-case for preserving it. Additionally, how did you generate the IR in the reproducer -- debugging intrinsics haven't had non-zero line numbers for several releases, although it's possible they can leak in from adjacent instructions. |
Yes, I just mean the change from You can change the scope here to It might be that the scoping rules in LLVM mean I've accidentally implied this dies early, but I'm having trouble seeing why. I would have expected this scope to still be alive since the SSA value using it is alive and dominating and has a later use, with no intervening
I wrote this by hand based on some IR from an older version of LLVM |
By moving the variable into the !4 scope, you've moved it into a scope that is live (it has instructions in it - it's the function itself, not a nested scope (& not an empty nested scope, especially)) - so it's to be expected that that makes the variable live/emitted. You mention the phi - perhaps a different way to look at it that might be more clear: In the resulting assembly/machine code, which instructions do you expect to be attributed to the nested scope (!14)? This isn't really about the variable, but about why the scope is empty/whether the scope should be non-empty, I think. That might make the conversation clearer. |
Thanks, that's very helpful yeah
My expectation would be for a debug scope to be live if there are instructions where its debug information is relevant, which for This is a case that might be worth looking at: https://godbolt.org/z/YMEshTMxM The machine instruction associated with the block (the You can see that by trying |
I think I realized where I got myself confused here. DWARF expects that a variable shouldn't be available for use outside of its block, so the I am still surprised that in the original example |
Indeed @ variables only being available inside their scopes. Unfortunately I don't think we can warn on variable-locations being present when there are no instructions remaining for that scope, as we can lose source-locations (and thus scopes) pretty much anywhere in the compiler during optimisation. We'd end up needing to perform debug-info maintenance very frequently (costing lots of compile-time). Alas, that's a design decision for efficiency that produces this misleading scenario as a cost.
I reckon it's more accidental than intentional, as we simply throw away all source-location information for PHIs when they're lowered during codegen. IIRC there have been a lot of difficult questions in the past about whether the source-location for a PHI should mean anything at all, because they're an artifact of LLVMs SSA representation rather than being a "true" part of the source code. I would imagine that better-preserving the original source location of the |
Yeah, I wouldn't expect the PHI itself to have that labeled debug information necessarily. Especially the emitted code for each incoming edge seems best to have implicitly become part of the predecessor block like it does already. I was mostly thinking that it should be propagated forward, based on the docs saying that "the DWARF generator defaults to allowing the last-set location after a label to cascade forward" - but that's a small quirk either way Thanks much for setting my thinking straight and walking me through the details here - I'm going to close this as resolved |
Checking with:
The debug information for "result" is missing
It seems to be related to the fact that the PHINode is the only user of the DILexicalBlock. It works if you either:
DILexicalBlock
in more locations:DISubprogram
as the scope ofresult
:Identified on LLVM 18.1.7
The text was updated successfully, but these errors were encountered: