Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: allow stopping services in "starting" state #503

Merged
merged 14 commits into from
Oct 10, 2024

Conversation

IronCore864
Copy link
Contributor

@IronCore864 IronCore864 commented Sep 24, 2024

Pebble has a 1-second okayWait time period for starting services, after which, if the service is still not exited, it's considered as "running". Previously, when services were still in the starting state within this 1-second okayWait period, they couldn't be stopped. This causes some issues in tests. And, just because services have just started doesn't mean they shouldn't be allowed to stop. This PR allows services in the starting state (within the 1s okayWait period) to be stopped, no matter if it's the daemon exits (Ctrl+C in the console) or the pebble stop command is issued.

Fixes #410
Fixes #502

Also adding two test cases:

  1. servstate: use a mock service which sleeps a while then create a side effect, stop the service immediately after it's started, and the side effect won't be created.
  2. daemon: start a service, stop the daemon before the okay delay, and a stop change should be created. Whether the service is actually terminated or not isn't tested because it's tested above in the servstate.

@IronCore864 IronCore864 changed the title [WIP] fix: allow stopping services that are starting and restart services that quit quickly. fix: allow stopping services that are starting and restart services that quit quickly Sep 26, 2024
@IronCore864 IronCore864 marked this pull request as ready for review September 26, 2024 05:49
@benhoyt benhoyt changed the title fix: allow stopping services that are starting and restart services that quit quickly fix: allow stopping services in "starting" state Sep 26, 2024
Copy link
Contributor

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with the fix, thanks. However, I wonder if we can spend some time on test cases. Ideally we could have a test for each of the issues this will close:

  • If a service is stopped during okayWait, it 1) returns the "stopped before the okay delay" error, and 2) is actually terminated
  • If the daemon is stopped during okayWait, the service is terminated

internals/overlord/servstate/state-diagram.dot Outdated Show resolved Hide resolved
internals/overlord/servstate/handlers.go Outdated Show resolved Hide resolved
internals/daemon/daemon_test.go Outdated Show resolved Hide resolved
internals/daemon/daemon_test.go Outdated Show resolved Hide resolved
internals/overlord/servstate/manager_test.go Outdated Show resolved Hide resolved
internals/overlord/servstate/manager_test.go Outdated Show resolved Hide resolved
@benhoyt
Copy link
Contributor

benhoyt commented Oct 2, 2024

Also seems like the new daemon test is failing (sometimes). Let's ensure it's not flakey with some stress testing of the new tests.

@IronCore864
Copy link
Contributor Author

Vulnerability scanning failed, which is a known issue. Here is a workaround, shall we adopt it? I prepared a PR for this: #507.

Copy link
Contributor

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Contributor

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test fix looks reasonable to me. Feel free to merge this PR (and we can discuss the TestDeadlock issue you found separately).

@IronCore864 IronCore864 merged commit acaacb2 into canonical:master Oct 10, 2024
18 checks passed
IronCore864 added a commit that referenced this pull request Oct 14, 2024
…rting state (#510)

In [this PR](#503), we
introduced a feature to allow stopping services that are in the starting
state / within the okayDelay. However, this makes [a deadlock issue in
ringbuffer a real
issue](#508). So we are
reverting this change now and will redo it after the deadlock is
resolved.
IronCore864 added a commit that referenced this pull request Oct 21, 2024
Previously, we introduced [a fix to allow stopping services in the
starting state](#503).

Because of this fix, we discovered [another deadlock
issue](#508), so we rolled it
back. Now that the deadlock issue is [fixed by this
PR](https://github.com/canonical/pebble/pull/511/files), we are
reintroducing the fix about "allowing stopping services in the starting
state".

Benchmark test, manual test to reproduce the 3-way deadlock issue, and
race test are all done and passed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants