Demangle Symbols in Debuggers (LLDB, GDB) #540

Open
miguelmartin75 opened this issue Nov 28, 2023 · 2 comments

@miguelmartin75

Summary

Related issue which is closed: nim-lang/Nim#8596

  • Nim can be debugged with LLDB (or GDB)
  • Name mangling hurts the debugging experience in LLDB and GDB because you must refer to Nim symbols by their mangled names.
    • The suggested workaround is hacky. For variables, you have to print all local or global variables and then scan the GUI or terminal output for the one you want. For other symbols, e.g. when setting a breakpoint on a function, I can see this being quite frustrating.
    • Ideally we would not have to use mangled names at all, e.g. we could write print x rather than print x_<nim-specific-mangle>
  • This is a common problem, as noted from the forums:

Description

Here are my findings from researching LLDB; I have not researched GDB. I am posting them here in case others want to implement this, or can point out something I have missed in my proposed solution.

For LLDB, one needs to:

  1. (Required) Let LLDB know how to identify the mangling scheme & how to de-mangle a symbol
  2. (Optional) Implement a Language plugin for deeper LLDB integration

References:

From reading the source: LLDB needs a mangling scheme that can be told apart from all others, along with code to de-mangle it. Every mangling scheme used by the other languages/compilers (C++/Itanium, C++/MSVC, D, Rust) starts with a prefix that identifies the scheme, and therefore the compiler that produced the name.
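To illustrate the prefix idea, here is a minimal, self-contained sketch of prefix-based classification. It is not LLDB's actual code: the enum and function names are made up for this example, and only the well-known prefixes (`_Z` for Itanium, `?` for MSVC, `_R` for Rust v0, `_D` for D) are taken from the respective schemes.

```cpp
#include <string_view>

// Illustrative enum, loosely modelled on the idea of a "mangling scheme" tag.
// The names here are assumptions for this sketch, not LLDB's real identifiers.
enum class ManglingScheme { None, Itanium, MSVC, RustV0, D };

// True if `s` begins with `prefix` (std::string_view::starts_with is C++20,
// so this helper keeps the sketch C++17-compatible).
constexpr bool starts_with(std::string_view s, std::string_view prefix) {
    return s.substr(0, prefix.size()) == prefix;
}

// Classify a symbol purely by its leading characters.
constexpr ManglingScheme classify(std::string_view name) {
    if (name.empty()) return ManglingScheme::None;
    if (starts_with(name, "?"))  return ManglingScheme::MSVC;    // MSVC C++
    if (starts_with(name, "_R")) return ManglingScheme::RustV0;  // Rust v0
    if (starts_with(name, "_D")) return ManglingScheme::D;       // D
    if (starts_with(name, "_Z")) return ManglingScheme::Itanium; // Itanium C++
    return ManglingScheme::None; // plain C symbol, or an unknown scheme
}
```

With this sketch, `classify("_ZN3foo3barEv")` yields `ManglingScheme::Itanium`, while a plain C name such as `main` yields `ManglingScheme::None`. A Nim symbol emitted through the C backend would fall into that last bucket today, which is the problem described next.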

For Nim, identifying the mangling scheme/language from a mangled name is more complex, because Nim is compiled into a target language (such as C or C++) whose compiler may apply its own, existing mangling scheme. If we had control over the binary or the debug symbol output (e.g. DWARF), I believe this would be easier; but since the target language's compiler produces that output, it is not straightforward.

To solve this with today's standard Nim compiler, here are my researched steps:

  1. Embed a unique constant identifier in each symbol to mark it as emitted by the Nim compiler. The modification would go here: https://github.com/nim-lang/Nim/blob/502a4486aeb8d0a5dcdf86540522d3dc16960536/compiler/ccgutils.nim#L71
    • Unfortunately this could collide with identifiers used by C or C++ code in existing codebases. Unicode characters in the marker would make conflicts rare, but would require C99 or above
    • This probably requires an RFC and further discussion
  2. Modify LLDB:
    1. Modify the Mangled class
      1. Add mangling scheme enum entry for Nim here: https://github.com/llvm/llvm-project/blob/main/lldb/include/lldb/Core/Mangled.h#L41-L48
      2. Classify whether the symbol originates from the Nim compiler, using the marker from step 1: https://github.com/llvm/llvm-project/blob/main/lldb/source/Core/Mangled.cpp#L42-L79
        • Implementation seems to require one-level deep recursion
      3. Implement the demangling code in C++ and call it
    • Getting this accepted into LLDB might be difficult, since Nim-mangled names are also valid C/C++ identifiers. Perhaps a compile-time option similar to Apple's LLDB (see here) or a run-time flag would be appropriate (this seems to require many modifications to LLDB; maybe the LLVM folks know best)
  3. (Optional) Implement a Language plugin. Why? Deeper integration with LLDB

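The steps above can be sketched roughly as follows. Everything Nim-specific here is hypothetical: the `nimZ_` marker does not exist in the Nim compiler today and is only a stand-in for whatever identifier an RFC would settle on, and the function names are made up. The Itanium handling is deliberately simplified to un-nested names of the form `_Z<length><identifier>...`; real code would use a full demangler. The `depth` parameter shows one reading of the "one-level deep recursion" remark: first unwrap the target language's mangling, then check the inner name for the Nim marker.

```cpp
#include <cstddef>
#include <string_view>

// HYPOTHETICAL marker: an assumption for illustration, not a real Nim prefix.
constexpr std::string_view kNimMarker = "nimZ_";

constexpr bool has_prefix(std::string_view s, std::string_view p) {
    return s.substr(0, p.size()) == p;
}

// Simplified Itanium handling: extracts <identifier> from
// "_Z<length><identifier>...". Returns an empty view for anything else
// (nested names, templates, special names), which a real demangler would handle.
constexpr std::string_view itanium_source_name(std::string_view name) {
    if (!has_prefix(name, "_Z")) return {};
    name.remove_prefix(2);
    std::size_t len = 0;
    std::size_t digits = 0;
    while (digits < name.size() && name[digits] >= '0' && name[digits] <= '9') {
        len = len * 10 + static_cast<std::size_t>(name[digits] - '0');
        ++digits;
    }
    if (digits == 0 || len == 0 || len > name.size() - digits) return {};
    return name.substr(digits, len);
}

// Returns the name with the hypothetical Nim marker stripped, or an empty
// view if the symbol does not look Nim-generated. Recursion is capped at one
// level: a C++-mangled wrapper is unwrapped once, then re-checked.
constexpr std::string_view nim_name(std::string_view symbol, int depth = 0) {
    if (has_prefix(symbol, kNimMarker)) {
        return symbol.substr(kNimMarker.size());
    }
    if (depth == 0) {
        std::string_view inner = itanium_source_name(symbol);
        if (!inner.empty()) return nim_name(inner, 1);
    }
    return {};
}
```

Under these assumptions, both `nim_name("nimZ_foo")` (C backend) and `nim_name("_Z8nimZ_fooi")` (C++ backend, Itanium-mangled) yield `foo`, while `nim_name("_Z3bari")` and `nim_name("printf")` yield an empty view. Stripping any further Nim-specific decoration from the name would be part of the real demangler.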
Alternatives

Here are some alternatives I can think of, but they will likely require more work:

  1. Modify the Nim compiler to output the target assembly directly (or via LLVM); this is related to NIR
    • It would likely be easier to convince the LLVM/LLDB team to merge the name de-mangling changes for Nim if they did not conflict with C/C++ symbols
  2. Write a debugger in Nim. Pros:
    • Would offer a chance to integrate with the compiler, e.g. to evaluate NimScript in the debugger, to modify the program at run-time, or to provide a REPL similar to Swift's
    • Reading & modifying the LLDB code is hard with all the OOP/abstraction

Examples

No response

Backwards Compatibility

My proposed solution changes how the Nim compiler mangles names. For backward compatibility, one could offer a flag to mangle the old way, though I don't think this flag would be necessary: just re-compile your source if you want debugging support.

Links

Mangling & D:

LLDB codepointers:

Writing a debugger:

@ringabout transferred this issue from nim-lang/Nim Nov 28, 2023
@Zectbumo

+1 Please let's write our own debugger.

@ire4ever1190

Implementing this for GDB would be a similar process (see footnote 1). IMO adding support to existing debuggers is better than writing our own, since it means less maintenance and allows easy integration with existing tools.

Footnotes

  1. https://sourceware.org/gdb/wiki/Internals%20Adding-a-Source-Language-to-GDB
