
Testing s390x with RUST_BACKTRACE=1 in QEMU crashes #9719

Closed
alexcrichton opened this issue Dec 3, 2024 · 3 comments · Fixed by #9725
Labels
cranelift:area:s390x Issues related to Cranelift's s390x backend

Comments

alexcrichton commented Dec 3, 2024

In landing #9702 I was wrestling with an s390x-specific failure on CI. The problem seems to stem from running with RUST_BACKTRACE=1 and using std::backtrace, which an updated version of anyhow now does. That code has all landed, so the current main branch of Wasmtime fails with:

$ export RUST_BACKTRACE=1
$ export CARGO_PROFILE_DEV_OPT_LEVEL=2
$ cargo test -p wasmtime-wasi --target s390x-unknown-linux-gnu
...
    Finished `test` profile [optimized + debuginfo] target(s) in 56.55s
     Running unittests src/lib.rs (target/s390x-unknown-linux-gnu/debug/deps/wasmtime_wasi-e36f6fb1833db8d7)

running 15 tests
test host::filesystem::test::table_readdir_works ... ok
test stdio::test::memory_stdin_stream ... ok
test random::test::deterministic ... ok
test stdio::test::async_stdout_stream_unblocks ... ok
test pipe::test::backpressure_read_stream ... ok
test stdio::test::async_stdin_stream ... ok
test pipe::test::infinite_read_stream ... ok
test pipe::test::finite_read_stream ... ok
test pipe::test::empty_read_stream ... ok
test pipe::test::sink_write_stream ... ok
test pipe::test::closed_write_stream ... ok
test pipe::test::multiple_chunks_write_stream ... ok
test pipe::test::multiple_chunks_read_stream ... ok
test pipe::test::backpressure_write_stream ... ok
test pipe::test::backpressure_write_stream_with_flush ... ok

test result: ok. 15 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.36s

     Running tests/all/main.rs (target/s390x-unknown-linux-gnu/debug/deps/all-d771c2793612b780)

running 196 tests
test async_::preview1_clock_time_get ... ok
test async_::preview1_fd_filestat_get ... ok
test api::api_time ... ok
test api::api_reactor ... ok
test async_::preview1_big_random_buf ... ok
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
error: test failed, to rerun pass `-p wasmtime-wasi --test all`

Caused by:
  process didn't exit successfully: `qemu-s390x -L /usr/s390x-linux-gnu -E LD_LIBRARY_PATH=/usr/s390x-linux-gnu/lib -E WASMTIME_TEST_NO_HOG_MEMORY=1 /home/alex/code/wasmtime/target/s390x-unknown-linux-gnu/debug/deps/all-d771c2793612b780` (signal: 11, SIGSEGV: invalid memory reference)
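
As a rough sketch of the trigger (not the failing test itself; assumes a Rust toolchain and anyhow version with std backtrace support): constructing an anyhow error with RUST_BACKTRACE=1 set captures a std::backtrace::Backtrace, and capturing it walks the stack through the same unwind info this crash implicates.

// Minimal sketch of the backtrace capture that exercises the unwinder;
// "boom" is just a placeholder message.
fn main() {
    // With RUST_BACKTRACE=1 in the environment, anyhow captures a
    // std::backtrace::Backtrace when the error is created, which walks
    // the stack via the registered (here: faulty) unwind info.
    let err = anyhow::anyhow!("boom");
    println!("{err:?}"); // Debug output includes the captured backtrace
}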

@uweigand would you be able to help take a closer look at this? I'm not sure whether this is a Wasmtime/JIT code issue (unwind info, maybe?) or perhaps something else in rustc.

alexcrichton added the cranelift:area:s390x label Dec 3, 2024

uweigand commented Dec 3, 2024

The good news is that I can reproduce this natively, so it's not a QEMU issue. The segfault happens in MD_FALLBACK_FRAME_STATE_FOR in libgcc because of an invalid PC. This typically indicates a problem with the unwind info in a lower frame, and indeed GDB isn't able to unwind fully either. I'll need to look into where this comes from.


uweigand commented Dec 4, 2024

This is a regression introduced with the tail-call ABI: if a function has incoming tail-call stack arguments, the DWARF rule for unwinding the caller's SP is incorrect. I'm working on a fix.
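
To make the failure mode concrete, here is a minimal sketch of the arithmetic (plain Rust, not Wasmtime code). The CFA address and tail-call argument size are hypothetical numbers chosen for illustration, and the frame-layout facts (and the direction of the discrepancy) are taken from the commit message below.

// On s390x, the correct unwound SP is a fixed offset from the CFA, but
// the SP value saved in the register save area still reflects the
// incoming tail-call stack arguments, so restoring SP from that slot
// (a DW_CFA_offset-style memory rule) lands in the wrong place.
fn main() {
    let cfa: u64 = 0x1000;    // hypothetical canonical frame address
    let save_area: u64 = 160; // fixed s390x register save area size
    let tail_args: u64 = 32;  // hypothetical incoming tail-call arg bytes

    let unwound_sp = cfa - save_area;             // always correct
    let saved_slot = cfa - save_area - tail_args; // stale by tail_args

    assert_ne!(unwound_sp, saved_slot);
    println!("correct SP: {unwound_sp:#x}, stale slot: {saved_slot:#x}");
}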

uweigand added a commit to uweigand/wasmtime that referenced this issue Dec 4, 2024
On s390x, the unwound SP is always at current CFA - 160.  Therefore,
the default rule used on most other platforms (which sets the
unwound SP to the current CFA) is incorrect, so we need to provide
an explicit DWARF CFI rule to unwind SP.

With the platform ABI, the caller's SP is always stored in the
register save area like other call-saved GPRs, so we can simply
use a normal DW_CFA_offset rule.  However, with the new tail-call
ABI, the value saved in that slot is incorrect - it is not
corrected for the incoming tail-call stack arguments that will
have been removed as the tail call returns.

To fix this without introducing unnecessary run-time overhead,
we can simply use a DW_CFA_val_offset rule that will set the
unwound SP to CFA - 160, which is always correct.  However, the
current UnwindInst abstraction does not allow any way to generate
this DWARF CFI instruction.  Therefore, we introduce a new
UnwindInst::RegStackOffset rule for this purpose.

Fixes: bytecodealliance#9719
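
As a rough sketch of how such a directive might be lowered (this is not the actual Cranelift patch; the UnwindInst variant shape, its field names, and the use of the gimli crate's DWARF-writing API are assumptions for illustration):

// Hedged sketch only; the real UnwindInst::RegStackOffset in Cranelift
// may differ. DW_CFA_val_offset says: the register's unwound *value*
// is CFA + offset (no memory load), which is exactly what the
// "SP = CFA - 160" rule needs.
use gimli::write::CallFrameInstruction;
use gimli::Register;

// Hypothetical unwind directive mirroring the one described above.
enum UnwindInst {
    RegStackOffset { reg: Register, cfa_offset: i32 },
}

fn to_cfi(inst: &UnwindInst) -> CallFrameInstruction {
    match inst {
        UnwindInst::RegStackOffset { reg, cfa_offset } => {
            // Emits a DW_CFA_val_offset rule via gimli's write API.
            CallFrameInstruction::ValOffset(*reg, *cfa_offset)
        }
    }
}

fn main() {
    // %r15 is the s390x stack pointer (DWARF register number 15).
    let sp = Register(15);
    let _rule = to_cfi(&UnwindInst::RegStackOffset { reg: sp, cfa_offset: -160 });
}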

uweigand commented Dec 4, 2024

The above PR fixes the issue for me.

github-merge-queue bot pushed a commit that referenced this issue Dec 4, 2024 (same commit message as above; Fixes: #9719)