Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record that the sigframe write recorded is conservative #3800

Conversation

KJTsanaktsidis
Copy link
Contributor

When handling a signal in a tracee, the kernel will spill some data to the stack (the "sigframe"), and this data needs to be recorded in the trace for replaying. However, we don't actually know how much data the kernel will write. Instead, we take a conservative estimate which is guaranteed to be at least that size, and record that amount of memory.

However, that range might cross a memory mapping, and during replay, part of that range might not be mapped. Currently that raises an assertion error because the write is smaller than the read.

In this patch, we record whether a memory "write" is actually this kind of conservative capture, and mark it as such in the trace. During replay, we then do not fail that assertion.

Should fix #3779

@rocallahan
Copy link
Collaborator

I thought we were going to do your plan, "what if we just made the replay skip the assertion, if the write comes through here"? This looks like an implementation of my plan. I think your plan is better :-).

@KJTsanaktsidis
Copy link
Contributor Author

Oh! Had a closer look at the code when I started implementing today, and I actually thought this way was more elegant. Because the recording code writing the trace is the bit that knows that the memory range recorded is a guess and it may not all be clobbered by the kernel. That was my understanding anyway.

Happy to do it the other way though! I’ll make a different PR.

@rocallahan
Copy link
Collaborator

No, we can do it this way.

@@ -1724,7 +1724,7 @@ bool RecordTask::record_remote_by_local_map(remote_ptr<void> addr,
return false;
}

void RecordTask::record_remote(remote_ptr<void> addr, ssize_t num_bytes) {
void RecordTask::record_remote(remote_ptr<void> addr, ssize_t num_bytes, bool sizeIsConservative) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use snake case, so size_is_conservative.

But for public APIs it's better to use a two-value enum than a bool parameter --- more readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't 100% sure what a good name for this would be. MemWriteSizeValidaiton seems a bit wordy, but there it is.

@@ -56,6 +56,7 @@ class TraceStream {
size_t size;
pid_t rec_tid;
std::vector<WriteHole> holes;
bool sizeIsConservative;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

snake case

@@ -344,6 +347,7 @@ class TraceReader : public TraceStream {
std::vector<uint8_t> data;
remote_ptr<void> addr;
pid_t rec_tid;
bool sizeIsConservative;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Snake case

@@ -354,6 +358,7 @@ class TraceReader : public TraceStream {
remote_ptr<void> addr;
pid_t rec_tid;
std::vector<WriteHole> holes;
bool sizeIsConservative;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

snake case

raw_recs.push_back({ addr, total_len, rec_tid, holes });
const std::vector<WriteHole>& holes,
bool sizeIsConservative) {
raw_recs.push_back({ addr, total_len, rec_tid, holes, sizeIsConservative});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space before }

@@ -573,7 +573,7 @@ TraceFrame TraceReader::read_frame(FrameTime skip_before) {
const auto& hole = holes[j];
h[j] = { hole.getOffset(), hole.getSize() };
}
raw_recs[i] = { w.getAddr(), (size_t)w.getSize(), i32_to_tid(w.getTid()), h };
raw_recs[i] = { w.getAddr(), (size_t)w.getSize(), i32_to_tid(w.getTid()), h, w.getSizeIsConservative()};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhere there should be a call to setSizeIsConservative() but I think it's missing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, whoops, yeah, this is very obviously not going to do anything useful without it.

When handling a signal in a tracee, the kernel will spill some data to
the stack (the "sigframe"), and this data needs to be recorded in the
trace for replaying. However, we don't actually know how much data the
kernel will write. Instead, we take a conservative estimate which is
guaranteed to be at least that size, and record that amount of memory.

However, that range might cross a memory mapping, and during replay,
part of that range might not be mapped. Currently that raises an
assertion error because the write is smaller than the read.

In this patch, we record whether a memory "write" is actually this kind
of conservative capture, and mark it as such in the trace. During
replay, we then do not fail that assertion.
@KJTsanaktsidis KJTsanaktsidis force-pushed the ktsanaktsidis/record_maybe_too_big branch from c9b1eee to 28be9f6 Compare August 27, 2024 11:03
@KJTsanaktsidis
Copy link
Contributor Author

I didn't use an enum instead of a bool in the capn proto file, since it seems we have bool fields in there.

I'd also love some pointers on how you think this should be tested. There are already test cases covering vfork, this is just a sometimes-problem...

@rocallahan
Copy link
Collaborator

There's no easy way to test this so I think we should just ignore that for now.

@rocallahan rocallahan merged commit 31c82c8 into rr-debugger:master Aug 27, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Assertion `nwritten == buf_size' failed to hold (Again)
2 participants