Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another batch of seek-ticks fixes #3644

Merged
merged 4 commits into from
Nov 11, 2023

Conversation

dzaima
Copy link
Contributor

@dzaima dzaima commented Nov 7, 2023

This fixes two bugs:

  • seek-ticks to a tick exactly on a SCHED event goes to the other thread, not the current one (the >=< change);

  • seek-ticks to within the current event skips too far if the next event is for a different tid.

Additionally, this extracts the logic for tick→time conversion to a function (first commit doing nothing but that).

@dzaima
Copy link
Contributor Author

dzaima commented Nov 7, 2023

fails tests; will look at tomorrow. (no clue how to write tests for the bugs here)

@rocallahan
Copy link
Collaborator

(no clue how to write tests for the bugs here)

It should be possible to construct a program that does some context switching and use the bash .run script to run rr dump to find the right event number to run a test at.

@dzaima
Copy link
Contributor Author

dzaima commented Nov 8, 2023

Pushed some testing. For whatever reason, in a sequence of

global_time:* ticks:A tid:F
global_time:X ticks:B tid:G event:SCHED
global_time:* ticks:C tid:F

a run X results in thread F being active and the ticks A and C being the ticks to use, not B, which I don't know what to think about (and doesn't happen with at least some other event types).

The failed test is from the comparison change, previously changed the other direction here. I don't see any real reason for a seek-ticks to prefer a specific event number though, when all match the ticks, and currently I don't see a nice way to make it go the previous way while fixing the bug here.

@dzaima
Copy link
Contributor Author

dzaima commented Nov 8, 2023

Actually, even before these changes, seek-ticks 0 didn't consistently go to the first event, e.g. on mainline:

// initial status of a bin/tick0 recording:
(rr) when
Current event: 15
(rr) when-ticks
Current tick: 0

// thing currently tested by tick0:
(rr) c
Continuing.
EXIT-SUCCESS
(rr) seek-ticks 0
(rr) when
Current event: 15

// broken thing:
(rr) run 15
(rr) when
Current event: 16 // run 15 goes to when==16, whatever, regular rr weirdness
(rr) seek-ticks 0
(rr) when
Current event: 17 // seek-ticks 0 didn't go to 15!

whereas with my changes here it at least consistently always goes to 17:

(rr) when
Current event: 15
(rr) when-ticks
Current tick: 0

(rr) seek-ticks 0
(rr) when
Current event: 17

(rr) c
Continuing.
EXIT-SUCCESS
(rr) seek-ticks 0
(rr) when
Current event: 17

(rr) run 15
(rr) when
Current event: 16
(rr) seek-ticks 0
(rr) when
Current event: 17

@rocallahan
Copy link
Collaborator

I don't see any real reason for a seek-ticks to prefer a specific event number though, when all match the ticks, and currently I don't see a nice way to make it go the previous way while fixing the bug here.

I'm happy for you to change the existing test behaviour here.

@rocallahan
Copy link
Collaborator

regular rr weirdness

Yes, we should probably clean up the way event numbers work so that rr replay -g and run event numbers take you to the start of the execution of that event instead of the end.

@dzaima
Copy link
Contributor Author

dzaima commented Nov 9, 2023

Yes, we should probably clean up the way event numbers work so that rr replay -g and run event numbers take you to the start of the execution of that event instead of the end.

Is that what's happening though? Perhaps my understanding of events/ticks is even worse than I imagined (which is extremely possible), but I think a run n does go to the time where event n happened - a seek-ticks after it matches the ticks for event n in rr dump, and going from rr dump delta real_time values, I believe the mapping of its events is accurate too (i.e. if syscall X was set up to happen 3 seconds after the previous event, the line in rr dump with a real_time three seconds later than the previous one will be a syscall X ENTERING_SYSCALL event, running to whose global_time indeed goes to the syscall in question (and so does the following EXITING_SYSCALL event; ignoring the edge-case of SCHED from above)). I believe it's when that's wrongly reporting the next event (..or, I guess equally, everything other than when is reporting the event before the current one).

Also, didn't notice the seekticks_threads-32 fail on my push of a comment change... ouch. Don't think it's any of my changes being broken though, but just my new test hitting some existing issue (and/or getting very unlucky)? The failure is

[FATAL src/PerfCounters.cc:970:read_ticks()]
 (task 320807 (rec:307717) at time 198)
 -> Assertion `!counting_period || interrupt_val <= adjusted_counting_period' failed to hold. Detected 695896 ticks, expected no more than 689668

which is similar to #3614 (just a bit above the value; 32-bit).

Other than this, I did adapt the tick0 test to the new behavior, so this PR should be done

@rocallahan rocallahan merged commit f659017 into rr-debugger:master Nov 11, 2023
1 check passed
@rocallahan
Copy link
Collaborator

See #3674 3674

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants