Allow to query RISC-V ELF binaries #23

Merged: 4 commits, Jul 19, 2024
57 changes: 57 additions & 0 deletions examples/hello-riscv/README.md
@@ -0,0 +1,57 @@
# Hello World in RISC-V

This example shows how to assemble, link and query a simple Hello World program for RISC-V.

If you want to run this example, you will need a cross-compiling toolchain like the [RISC-V GNU Compiler Toolchain](https://github.com/riscv-collab/riscv-gnu-toolchain) and [QEMU](https://www.qemu.org/).

## 32-bit

Assemble `hello.s` into an object file for the RISC-V 32-bit base integer instruction set (`-march rv32i`), little-endian (`-mlittle-endian`), using the ILP32 ABI, in which `int`, `long` and pointer types are all 32-bit (`-mabi ilp32`), with debug symbols included in the object file (`-g`):

```sh
riscv64-elf-as -march rv32i -mabi ilp32 -mlittle-endian -o hello.o hello.s -g
```

Link the object file into a RISC-V 32-bit little-endian executable (`-m elf32lriscv`), with the symbol `_start` as its entry point:

```sh
riscv64-elf-ld -e _start -m elf32lriscv -o exe --verbose hello.o
```

Execute the RISC-V ELF in QEMU:

```sh
qemu-riscv32 exe
```

Double-check the disassembly:

```sh
riscv64-elf-objdump --disassemble exe
```

## 64-bit

Assemble `hello.s` into an object file for the RISC-V 64-bit base integer instruction set (`-march rv64i`), little-endian (`-mlittle-endian`), using the LP64 ABI, in which `long` and pointer types are 64-bit (`-mabi lp64`), with debug symbols included in the object file (`-g`):

```sh
riscv64-elf-as -march rv64i -mabi lp64 -mlittle-endian -o hello.o hello.s -g
```

Link the object file into a RISC-V 64-bit little-endian executable (`-m elf64lriscv`), with the symbol `_start` as its entry point:

```sh
riscv64-elf-ld -e _start -m elf64lriscv -o exe --verbose hello.o
```

Execute the RISC-V ELF in QEMU:

```sh
qemu-riscv64 exe
```

Double-check the disassembly:

```sh
riscv64-elf-objdump --disassemble exe
```
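
The 32-bit and 64-bit walkthroughs differ only in four values: the `-march` ISA string, the `-mabi` name, the linker emulation and the QEMU binary. As a rough sketch of that relationship, the Python below builds the same command lines from a small table. The names `TARGETS` and `build_commands` and the `prefix` parameter are illustrative and not part of this PR; the script only prints the commands and does not run them.

```python
import shlex

# Per-width settings copied from the README steps above.
TARGETS = {
    32: {"march": "rv32i", "mabi": "ilp32", "emulation": "elf32lriscv", "qemu": "qemu-riscv32"},
    64: {"march": "rv64i", "mabi": "lp64", "emulation": "elf64lriscv", "qemu": "qemu-riscv64"},
}


def build_commands(xlen: int, prefix: str = "riscv64-elf-") -> list[list[str]]:
    """Return the assemble, link, run and disassemble commands for one width."""
    if xlen not in TARGETS:
        raise ValueError(f"unsupported register width: {xlen}")
    t = TARGETS[xlen]
    return [
        [prefix + "as", "-march", t["march"], "-mabi", t["mabi"],
         "-mlittle-endian", "-o", "hello.o", "hello.s", "-g"],
        [prefix + "ld", "-e", "_start", "-m", t["emulation"],
         "-o", "exe", "--verbose", "hello.o"],
        [t["qemu"], "exe"],
        [prefix + "objdump", "--disassemble", "exe"],
    ]


# Print the commands for both widths without executing anything.
for xlen in (32, 64):
    print(f"# {xlen}-bit")
    for command in build_commands(xlen):
        print(shlex.join(command))
```

Passing each list to `subprocess.run(command, check=True)` would execute the steps, assuming the toolchain and QEMU are on `PATH`.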
41 changes: 41 additions & 0 deletions examples/hello-riscv/hello.s
@@ -0,0 +1,41 @@
.section .text
Owner

Instead of assembly, this could probably be a cross-compiled C file run under QEMU.
This is fine for now, though.

.globl _start
.equ STDOUT, 1 # File descriptor 1 is standard output (stdout)
.equ WRITE, 64 # Linux write syscall
.equ EXIT, 93 # Linux exit syscall
.equ EXIT_CODE_SUCCESS, 0

_start:
# System call arguments are passed to the kernel in registers a0, a1, a2, ...
# in the same order as the parameters of the corresponding C function.
# For the write system call, the C prototype is:
# ssize_t write(int fd, const void *buf, size_t count)
# The three parameters passed are:
# 1. a file descriptor (e.g. 1 for stdout)
# 2. a pointer to a character buffer (i.e. a string)
# 3. the number of characters in that string to be written.
li a0, STDOUT
la a1, buf_begin
# Load a byte from memory, zero-extend it (to 32 bits in RV32, 64 bits in
# RV64), and store the unsigned value in the destination register a2.
lbu a2, buf_size

# Store the system call number in register a7.
li a7, WRITE
# Switch to RISC-V supervisor mode (the Linux kernel runs in this mode) and
# make a request using the value stored in a7 as the system call number.
ecall

li a0, EXIT_CODE_SUCCESS
li a7, EXIT
ecall

# The .rodata section of an ELF binary contains constant values. The .rodata
# section is marked as read-only, so these values cannot change at runtime.
.section .rodata

buf_begin:
.string "Hello World!\n"
buf_size:
# Current address (the .) minus address of buf_begin = length of buffer,
# including the NUL terminator that .string appends.
# We store the result in an 8-bit value using the .byte directive.
.byte .-buf_begin
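
To make the `.byte .-buf_begin` arithmetic concrete, here is a small Python sketch that mirrors the layout of `.rodata`. It relies on the GNU assembler's documented behaviour of terminating each `.string` with a zero byte; the variable names are illustrative only.

```python
# The text that `.string "Hello World!\n"` places in .rodata.
message = b"Hello World!\n"

# GNU as terminates each `.string` with a zero byte, so the label buf_size
# sits one byte past the visible text.
stored = message + b"\x00"

# `.byte .-buf_begin` evaluates to the distance between the two labels.
buf_size = len(stored)

# `.byte` holds a single byte, so this scheme only works up to 255 bytes.
assert buf_size <= 0xFF

print(len(message), buf_size)  # 13 14
```

The count passed to `write` is therefore 14, so the trailing NUL is written to stdout as well. Using `.ascii` instead of `.string` would be one way to leave the terminator out.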
2 changes: 2 additions & 0 deletions flake.nix
@@ -29,6 +29,8 @@
       with nixpkgsFor.${system}; {
         default = mkShellNoCC {
           venvDir = "./.venv";
+          # needed for tests
+          TEST_BINARY = "${coreutils}/bin/ls";
           packages = [
             python3Packages.pip
             # This execute some shell code to initialize a venv in $venvDir before
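
For context, `TEST_BINARY` gives the tests a path that exists inside the Nix shell, where `/bin/ls` is not available. How the test suite reads it is not shown in this diff; the sketch below is one plausible way, with the helper name `resolve_test_binary` and the `/bin/ls` fallback both being assumptions.

```python
import os
from pathlib import Path


def resolve_test_binary(default: str = "/bin/ls") -> Path:
    """Return the binary the tests should inspect.

    Hypothetical helper: TEST_BINARY is the variable exported by flake.nix;
    falling back to /bin/ls on other systems is an assumption.
    """
    return Path(os.environ.get("TEST_BINARY", default))


print(resolve_test_binary())
```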
15 changes: 13 additions & 2 deletions sqlelf/elf.py
@@ -225,14 +225,25 @@ def instructions_generator() -> Iterator[dict[str, Any]]:


 def mode(binary: lief_ext.Binary) -> int:
-    if binary.header.identity_class == lief.ELF.ELF_CLASS.CLASS64:
-        return cast(int, capstone.CS_MODE_64)
+    machine_type = binary.header.machine_type
+    identity_class = binary.header.identity_class
+    if machine_type == lief.ELF.ARCH.RISCV:
+        if identity_class == lief.ELF.ELF_CLASS.CLASS32:
+            return cast(int, capstone.CS_MODE_RISCV32)
+    if machine_type == lief.ELF.ARCH.RISCV:
+        if identity_class == lief.ELF.ELF_CLASS.CLASS64:
+            return cast(int, capstone.CS_MODE_RISCV64)
+    if machine_type == lief.ELF.ARCH.x86_64:
+        if identity_class == lief.ELF.ELF_CLASS.CLASS64:
+            return cast(int, capstone.CS_MODE_64)
     raise RuntimeError(f"Unknown mode for {binary.path}")


 def arch(binary: lief_ext.Binary) -> int:
     if binary.header.machine_type == lief.ELF.ARCH.x86_64:
         return cast(int, capstone.CS_ARCH_X86)
+    elif binary.header.machine_type == lief.ELF.ARCH.RISCV:
Owner

Change should really be what is in this file + the examples.

Contributor Author

Should I add the case for 32-bit RISC-V? I think that capstone and LIEF support it.

Owner

@fzakaria fzakaria Jul 3, 2024

sure if you can test it. :)

The Makefile examples are used in the GitHub CI via the Python tests.

Contributor Author

Yeah, I tried to run the tests locally, but they failed because I am on NixOS and don't have binaries like `/bin/ls` and `/usr/bin/ruby`.

The only "test" I did was executing the RISC-V ELF in QEMU and running a few queries on it.

+        return cast(int, capstone.CS_ARCH_RISCV)
     raise RuntimeError(f"Unknown machine type for {binary.path}")
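
The dispatch in `mode()` comes down to a lookup on two ELF header fields: the machine type (`e_machine`) and the class byte (`e_ident[EI_CLASS]`). The sketch below shows the same decision using only the standard library, so it runs without LIEF or capstone installed. It returns the name of the capstone constant as a string rather than the constant itself; `mode_name` and `fake_header` are illustrative helpers, not sqlelf functions.

```python
import struct

ELFCLASS32, ELFCLASS64 = 1, 2   # values of e_ident[EI_CLASS]
EM_X86_64, EM_RISCV = 62, 243   # values of e_machine

# (machine, class) -> name of the capstone mode constant returned by mode().
MODES = {
    (EM_RISCV, ELFCLASS32): "CS_MODE_RISCV32",
    (EM_RISCV, ELFCLASS64): "CS_MODE_RISCV64",
    (EM_X86_64, ELFCLASS64): "CS_MODE_64",
}


def mode_name(header: bytes) -> str:
    """Pick a mode name from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    elf_class = header[4]                           # EI_CLASS
    byte_order = "<" if header[5] == 1 else ">"     # EI_DATA: 1 = little-endian
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    try:
        return MODES[(machine, elf_class)]
    except KeyError:
        raise RuntimeError(
            f"Unknown mode for machine {machine}, class {elf_class}"
        ) from None


def fake_header(elf_class: int, machine: int) -> bytes:
    """Build a minimal little-endian header prefix, for demonstration only."""
    ident = b"\x7fELF" + bytes([elf_class, 1, 1]) + bytes(9)
    return ident + struct.pack("<HH", 2, machine)   # e_type = ET_EXEC, e_machine


print(mode_name(fake_header(ELFCLASS32, EM_RISCV)))
print(mode_name(fake_header(ELFCLASS64, EM_RISCV)))
```

With a real file, `mode_name(open("exe", "rb").read(20))` would classify the executable built in the README, assuming it exists in the working directory. A table keyed on both fields also avoids repeating the RISC-V machine check once per class.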

