Allow to query RISC-V ELF binaries #23

Merged: 4 commits, Jul 19, 2024

Contributor

@jackdbd jackdbd commented Jul 2, 2024

This PR addresses #22

RISC-V ELF binaries can now be analyzed like x86_64 ELF binaries.

Examples

Select the first 5 instructions in the entire ELF.

sqlelf examples/hello-riscv/exe --sql "SELECT * FROM elf_instructions LIMIT 5;"

List the sections of an ELF file and sort them by binary size, in descending order.

sqlelf examples/hello-riscv/exe --sql "SELECT * FROM elf_sections ORDER BY size DESC"

Find the 10 largest unexported symbols by binary size (as written, the query is not restricted to the .bss section).

sqlelf examples/hello-riscv/exe --sql \
"SELECT 
  name,
  section,
  size,
  type,
  value
FROM elf_symbols
WHERE exported = FALSE
ORDER BY size DESC
LIMIT 10;"
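
For restricting a query like this to a single section such as .bss, here is a minimal sketch using Python's built-in sqlite3 module. It runs against a hypothetical in-memory stand-in table with made-up rows, not sqlelf's real elf_symbols table; the helper names largest_unexported and demo_connection are assumptions for illustration only.

```python
import sqlite3
from typing import List, Tuple


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory table that imitates a few elf_symbols columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE elf_symbols ("
        "name TEXT, section TEXT, size INTEGER, "
        "type TEXT, value INTEGER, exported INTEGER)"
    )
    # Made-up rows, for illustration only.
    conn.executemany(
        "INSERT INTO elf_symbols VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("buf", ".bss", 4096, "OBJECT", 0x2000, 0),
            ("counter", ".bss", 8, "OBJECT", 0x3000, 0),
            ("msg", ".data", 64, "OBJECT", 0x1000, 0),
            ("_start", ".text", 32, "FUNC", 0x0, 1),
        ],
    )
    return conn


def largest_unexported(
    conn: sqlite3.Connection, section: str, limit: int = 10
) -> List[Tuple[str, int]]:
    """Return (name, size) of the largest unexported symbols in one section."""
    query = """
        SELECT name, size
        FROM elf_symbols
        WHERE exported = 0      -- 0 stands in for FALSE
          AND section = ?
        ORDER BY size DESC
        LIMIT ?
    """
    return conn.execute(query, (section, limit)).fetchall()


if __name__ == "__main__":
    for name, size in largest_unexported(demo_connection(), ".bss"):
        print(name, size)
```

Whether the real section column stores names in exactly this form should be checked against actual sqlelf output before relying on the filter.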

@jackdbd
Contributor Author

jackdbd commented Jul 2, 2024

I think it makes sense to remove 98d980e from this PR.

I made that commit because I struggled for a few hours to set up a development environment, until the combination of Nix flakes and poetry2nix saved the day.

However, LIEF (_lief.so to be precise) requires libstdc++.so.6, and I couldn't figure out how to make the linker find it.

 ldd _lief.so
        linux-vdso.so.1 (0x00007ffff7fc6000)
        libstdc++.so.6 => not found
        libm.so.6 => /nix/store/dbwp0scbb0rk78m636sb7cvycz8xzgyh-glibc-2.39-52/lib/libm.so.6 (0x00007ffff771d000)
        libc.so.6 => /nix/store/dbwp0scbb0rk78m636sb7cvycz8xzgyh-glibc-2.39-52/lib/libc.so.6 (0x00007ffff752f000)
        /nix/store/dbwp0scbb0rk78m636sb7cvycz8xzgyh-glibc-2.39-52/lib64/ld-linux-x86-64.so.2 (0x00007ffff7fc8000)

The only solution I was able to find for making ldd happy was to set LD_LIBRARY_PATH to a location in my nix store.
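
To check quickly which shared libraries are still unresolved after changing the environment, the ldd output can be scanned for "not found" entries. This is a minimal sketch, assuming the output format shown above; missing_libraries is a hypothetical helper, not part of sqlelf or LIEF.

```python
from typing import List

# Sample text copied from the ldd output above.
SAMPLE_LDD_OUTPUT = """\
        linux-vdso.so.1 (0x00007ffff7fc6000)
        libstdc++.so.6 => not found
        libm.so.6 => /nix/store/dbwp0scbb0rk78m636sb7cvycz8xzgyh-glibc-2.39-52/lib/libm.so.6 (0x00007ffff771d000)
        libc.so.6 => /nix/store/dbwp0scbb0rk78m636sb7cvycz8xzgyh-glibc-2.39-52/lib/libc.so.6 (0x00007ffff752f000)
"""


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Lines of interest look like: "<name> => <path or 'not found'>".
        name, separator, target = line.strip().partition(" => ")
        if separator and target.strip() == "not found":
            missing.append(name)
    return missing


if __name__ == "__main__":
    print(missing_libraries(SAMPLE_LDD_OUTPUT))
```

In practice the text could come from running ldd through subprocess.run with capture_output=True and text=True, once with and once without LD_LIBRARY_PATH set.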

@fzakaria
Owner

fzakaria commented Jul 3, 2024

I'll take a look. Originally I had a flake.nix (I'm very much into Nix), but using non-published wheels was pretty onerous for LIEF.
(They have a challenging build process.)

Can we split out the Nix stuff into a separate PR and just focus on the risc-v please?

@@ -233,6 +238,8 @@ def mode(binary: lief_ext.Binary) -> int:
def arch(binary: lief_ext.Binary) -> int:
    if binary.header.machine_type == lief.ELF.ARCH.x86_64:
        return cast(int, capstone.CS_ARCH_X86)
    elif binary.header.machine_type == lief.ELF.ARCH.RISCV:
Owner

Change should really be what is in this file + the examples.

Contributor Author

Should I add the case for RISC-V 32 bit? I think that capstone and LIEF support it.
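
On the 32-bit question: the 32/64-bit distinction is recorded in the ELF header itself (the EI_CLASS byte of e_ident), separately from the machine type. Below is a minimal, stdlib-only sketch of reading both fields. describe_elf_header is a hypothetical helper, not part of sqlelf, LIEF, or capstone, and the mapping from its result to a capstone mode is deliberately left out.

```python
import struct
from typing import Tuple

# Values from the ELF specification.
ELFCLASS32, ELFCLASS64 = 1, 2      # e_ident[EI_CLASS]
ELFDATA2LSB, ELFDATA2MSB = 1, 2    # e_ident[EI_DATA]
EM_X86_64 = 62                     # e_machine for x86-64
EM_RISCV = 243                     # e_machine for RISC-V

MACHINE_NAMES = {EM_X86_64: "x86_64", EM_RISCV: "riscv"}


def describe_elf_header(header: bytes) -> Tuple[str, int]:
    """Return (machine name, word size in bits) from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    ei_class, ei_data = header[4], header[5]
    if ei_class not in (ELFCLASS32, ELFCLASS64):
        raise ValueError("unknown ELF class")
    if ei_data not in (ELFDATA2LSB, ELFDATA2MSB):
        raise ValueError("unknown ELF data encoding")
    byte_order = "<" if ei_data == ELFDATA2LSB else ">"
    # e_machine is a 16-bit field at offset 18, right after e_type.
    (e_machine,) = struct.unpack_from(byte_order + "H", header, 18)
    bits = 32 if ei_class == ELFCLASS32 else 64
    return MACHINE_NAMES.get(e_machine, f"unknown({e_machine})"), bits


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        print(describe_elf_header(f.read(20)))
```

A result of ("riscv", 32) versus ("riscv", 64) is the distinction that a 32-bit case in the mode function would need to act on.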

Owner

@fzakaria fzakaria Jul 3, 2024

sure if you can test it. :)

The Makefile examples are used in the GitHub CI via the Python tests

Contributor Author

Yeah I tried to run the tests locally but they failed because I am on NixOS and I don't have binaries like /bin/ls and /usr/bin/ruby.

The only "test" I did was executing the RISC-V ELF in QEMU and running a few queries on it.

@jackdbd
Contributor Author

jackdbd commented Jul 3, 2024

> Can we split out the Nix stuff into a separate PR and just focus on the risc-v please?

Yes, I completely agree.

Also, I don't know why pandas takes forever to compile (30+ minutes on my old ThinkPad L380). I get that it depends on numpy and scipy, but still...

Do you know a better way to let the linker find libstdc++.so.6?

@fzakaria
Owner

fzakaria commented Jul 3, 2024

I guess you could change those tests to run which first to find the binaries.
I should test this on my NixOS laptop
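
The idea of resolving binaries through the PATH instead of hardcoding /bin/ls or /usr/bin/ruby can be sketched with the standard library's shutil.which. This is only an illustration under that assumption; find_binary is a hypothetical helper and does not reflect how the project's tests are actually written.

```python
import shutil
from typing import Optional


def find_binary(name: str, fallback: Optional[str] = None) -> Optional[str]:
    """Look up an executable on PATH, returning fallback if it is not found."""
    return shutil.which(name) or fallback


if __name__ == "__main__":
    # On NixOS this typically resolves to a /nix/store/... or profile path
    # rather than /bin/ls.
    print(find_binary("ls"))
```

A test could then skip itself when find_binary returns None rather than failing on systems that lack the conventional paths.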

@fzakaria
Owner

fzakaria commented Jul 7, 2024

Ping me when you think it's ready for a review.
Just clearing some of my notifications / emails.

@jackdbd
Contributor Author

jackdbd commented Jul 18, 2024

@fzakaria I cleaned up the README for the RISC-V example. Tell me if it's ok for you.

I wanted to add a few scripts that could be run with poetry run, but then I realized this project is configured to use setuptools as its build system, not Poetry.

@fzakaria
Owner

Finally got NixOS working enough to review this -- will try by EOD.

@@ -0,0 +1,41 @@
.section .text
Owner

Probably, instead of assembly, this could be a cross-compiled C file, with QEMU to run it.
This is fine for now, though.

@fzakaria fzakaria merged commit 978b1ec into fzakaria:main Jul 19, 2024
4 checks passed