Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid acquiring state lock in Command.ServeHTTP #366

Closed
benhoyt opened this issue Feb 22, 2024 · 1 comment · Fixed by #493
Closed

Avoid acquiring state lock in Command.ServeHTTP #366

benhoyt opened this issue Feb 22, 2024 · 1 comment · Fixed by #493

Comments

@benhoyt
Copy link
Contributor

benhoyt commented Feb 22, 2024

For every request that comes in, when sending the response in Command.ServeHTTP, we acquire and release the state lock (twice!). We should avoid this if possible, at the very least for the GET /v1/health endpoint, which needs to return quickly and avoid global locks even when the system is under heavy load.

This came up in https://bugs.launchpad.net/juju/+bug/2052517, where a charm is using Pebble exec (which uses the state lock) and then a MySQL backup is occurring which puts the system under heavy CPU load, so occasionally the health endpoint (even with no health checks defined) is taking longer than 1s to respond, which causes K8s to terminate the pod due to how we have the probes set up (with a 1s timeout and only 1 attempt): https://github.com/juju/juju/blob/a90328ccf574d4be960b0236a5c21cd030111fd2/caas/kubernetes/provider/application/application.go#L59-L68

So if we can avoid needing the state lock in ServeHTTP in general, that would be best. If not, we should at least figure out a way to avoid it in GET /v1/health.

@benhoyt
Copy link
Contributor Author

benhoyt commented Mar 13, 2024

We're now avoiding hitting acquiring the state lock in the health check endpoint (#369), but not the other endpoints. We should consider this as part of the lock improvements work in 24.10. It's a bit annoying that we acquire the lock (twice no less) at the end of every request for two features Pebble doesn't actually use (maintenance and warnings), so we should at least consider if we can drop those.

benhoyt added a commit to benhoyt/pebble that referenced this issue Jun 26, 2024
In daemon.Command.ServeHTTP, at the end of every request we acquire the
state lock twice, once for including the maintenance (restart) info in
the response, and once for including the warnings summary. We don't use
either maintenance/restarts or warnings in Pebble at all, so let's just
turn that off.

We could rip out those features completely, but the APIs need to stay
for backwards compatibility, and it's possible we'll want them later,
so let's just disable them and their tests for now.

Fixes canonical#366
benhoyt added a commit that referenced this issue Jul 17, 2024
This change avoids the need to lock/unlock state when fetching
`restart.Pending()`, avoiding the need for one state lock/unlock on
every HTTP request. (We hope to get rid of the other lock/unlock by
removing warnings, but that's a separate issue.)

Updates #366
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant