Skip to content

Commit

Permalink
Make memory checksums work with syscallbuf ABORT_COMMITs by avoiding …
Browse files Browse the repository at this point in the history
…issues with `syscallbuf_record::ret` and `syscallbuf_hdr::notify_on_syscall_hook_exit`

For `syscallbuf_record::ret` we can just record an early update to the field so that it has the correct value immediately during replay.

For `syscallbuf_hdr::notify_on_syscall_hook_exit`, recording the write performed by`is_safe_to_deliver_signal` is really painful because that write can end up in unexpected places, e.g. a FLUSH_SYSCALLBUF event, where it gets mistaken for the syscall buffer data. We could add metadata to fix that but it's quite complicated and adds a little of overhead for practially no gain. Instead, let's just wipe that flag out when we sanitize the syscallbuf data for checksumming.

This fixes a very rare intermittent where a stray deschedule in a checksumming test would trigger an ABORT_COMMIT which triggered a checksum failure. I've added a deterministic test for ABORT_COMMIT with checksums.
  • Loading branch information
rocallahan committed Aug 29, 2023
1 parent ff9ac5d commit 368c7a2
Show file tree
Hide file tree
Showing 5 changed files with 29 additions and 10 deletions.
1 change: 1 addition & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -1565,6 +1565,7 @@ set(TESTS_WITHOUT_PROGRAM
checkpoint_mmap_shared
checkpoint_prctl_name
checkpoint_simple
checksum_block_open
checksum_sanity_noclone
comm
cont_signal
Expand Down
14 changes: 6 additions & 8 deletions src/RecordSession.cc
Original file line number Diff line number Diff line change
Expand Up @@ -491,12 +491,8 @@ static void seccomp_trap_done(RecordTask* t) {
// It's safe to reset the syscall buffer now.
t->delay_syscallbuf_reset_for_seccomp_trap = false;

t->write_mem(REMOTE_PTR_FIELD(t->syscallbuf_child, failed_during_preparation),
(uint8_t)1);
uint8_t one = 1;
t->record_local(
REMOTE_PTR_FIELD(t->syscallbuf_child, failed_during_preparation), &one);

t->write_and_record(REMOTE_PTR_FIELD(t->syscallbuf_child, failed_during_preparation),
(uint8_t)1);
if (EV_DESCHED == t->ev().type()) {
// Desched processing will do the rest for us
return;
Expand Down Expand Up @@ -1100,8 +1096,10 @@ static void save_interrupted_syscall_ret_in_syscallbuf(RecordTask* t,
// Record storing the return value in the syscallbuf record, where
// we expect to find it during replay.
auto child_rec = t->next_syscallbuf_record();
int64_t ret = retval;
t->record_local(REMOTE_PTR_FIELD(child_rec, ret), &ret);
// Also store it there now so that our memory checksums are correct.
// It will be overwritten by the tracee's syscallbuf code.
t->write_and_record(REMOTE_PTR_FIELD(child_rec, ret),
static_cast<int64_t>(retval));
}

static bool is_in_privileged_syscall(RecordTask* t) {
Expand Down
7 changes: 7 additions & 0 deletions src/RecordTask.h
Original file line number Diff line number Diff line change
Expand Up @@ -423,6 +423,13 @@ class RecordTask final : public Task {
// exists
bool record_remote_by_local_map(remote_ptr<void> addr, size_t num_bytes);

template <typename T>
void write_and_record(remote_ptr<T> addr, const T& value, bool* ok = nullptr,
uint32_t flags = 0) {
write_mem(addr, value, ok, flags);
record_local(addr, &value, 1);
}

/**
* Save tracee data to the trace. |addr| is the address in
* the address space of this task.
Expand Down
5 changes: 5 additions & 0 deletions src/test/checksum_block_open.run
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
source `dirname $0`/util.sh
GLOBAL_OPTIONS="$GLOBAL_OPTIONS --checksum=on-all-events"
record block_open$bitness
replay
check EXIT-SUCCESS
12 changes: 10 additions & 2 deletions src/util.cc
Original file line number Diff line number Diff line change
Expand Up @@ -552,10 +552,18 @@ static void iterate_checksums(Task* t, ChecksumMode mode,
* the "extra data" for the possibly one pending
* record.
*
* The deterministic region excludes the notify_on_syscall_hook_exit
* flag. This flag is written by is_safe_to_deliver_signal and
* that write can occur at a different event to where ReplaySession
* eventually sets it.
*
* So here, we set things up so that we only checksum
* the deterministic region. */
auto child_hdr = m.map.start().cast<struct syscallbuf_hdr>();
auto hdr = t->read_mem(child_hdr);
struct syscallbuf_hdr hdr;
ASSERT(t, mem.size() >= sizeof(hdr));
memcpy(&hdr, mem.data(), sizeof(hdr));
hdr.notify_on_syscall_hook_exit = 0;
memcpy(mem.data(), &hdr, sizeof(hdr));
mem.resize(sizeof(hdr) + hdr.num_rec_bytes +
sizeof(struct syscallbuf_record));
}
Expand Down

0 comments on commit 368c7a2

Please sign in to comment.