Would it be possible to also compute the mapping between the LLM output and the input from Ghidra decompiler as a line map? Something like LLM_OUT_LINES[line_number] = {one or more line numbers from the Ghidra input}.
In your Colab example, the output line `if (fabs(a[i] - a[j]) < eps)` would be mapped to the 3 input lines it was derived from.
I'm not sure whether something like this can be done with LLMs at all. If it is doable, though, this project would be really useful for tools like profilers, which could mark the source lines where most time is spent by mapping assembly instructions to lines with the help of debug info.
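To make the shape concrete, here is a minimal Python sketch of the structure I have in mind; all names and line numbers are made up for illustration:

```python
# Hypothetical shape of the requested line map: each line number of the
# LLM output points to the set of Ghidra pseudo-C input line numbers it
# was derived from. All names and numbers here are illustrative.
from typing import Dict, Set

LLM_OUT_LINES: Dict[int, Set[int]] = {
    # e.g. the output condition `if (fabs(a[i] - a[j]) < eps)` on line 12
    # being derived from three lines of the Ghidra input
    12: {31, 32, 33},
}

def ghidra_lines_for(output_line: int) -> Set[int]:
    """Return the Ghidra input lines backing a given LLM output line."""
    return LLM_OUT_LINES.get(output_line, set())
```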
Aligning the input and output of a large language model isn't achievable unless we tailor the training process (similar to how `objdump -d -S` pairs each line of source code with a few lines of assembly). We plan to explore this line-by-line training approach (asm–src pairs, not Ghidra output) in future updates for a more versatile chat model. It might take a few months to develop, but we hope it will be beneficial.
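For the profiler use case you mention, the debug-info side of the pairing can already be extracted from a binary built with `-g`. Below is a rough, best-effort sketch (assuming binutils `objdump` is on the PATH, and parsing its text output rather than any stable API) that builds an instruction-address to source-line map with `objdump -d -l`, the same line information that `-S` interleaves as source text:

```python
# Rough sketch: recover an instruction-address -> (file, line) map from a
# binary compiled with -g, using `objdump -d -l` (the same debug-line info
# that `objdump -d -S` interleaves as source text). The parsing below is a
# best-effort approximation of objdump's text output, not a stable API.
import re
import subprocess
from typing import Dict, Optional, Tuple

def addr_to_source_line(binary: str) -> Dict[int, Tuple[str, int]]:
    out = subprocess.run(
        ["objdump", "-d", "-l", binary],
        capture_output=True, text=True, check=True,
    ).stdout

    mapping: Dict[int, Tuple[str, int]] = {}
    current: Optional[Tuple[str, int]] = None
    for line in out.splitlines():
        # Lines like "/src/foo.c:42" announce the source position of the
        # instructions that follow them.
        loc = re.match(r"^(\S+):(\d+)(?: \(discriminator \d+\))?$", line)
        if loc:
            current = (loc.group(1), int(loc.group(2)))
            continue
        # Disassembly lines start with an indented hex address, e.g. "  401136:".
        insn = re.match(r"^\s+([0-9a-f]+):\s", line)
        if insn and current is not None:
            mapping[int(insn.group(1), 16)] = current
    return mapping

# Example (hypothetical binary): addr_to_source_line("./a.out")
```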
We've also noticed that another group of researchers has done some work that may help with your situation; you might want to explore their models.