podman ps: fix racy pod name query #23325
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: Luap99
@edsantiago another one
/hold need to do more debugging #23326 (comment)
The pod name was queried without holding the container lock, so the pod could be deleted in the meantime and podman failed with "no such pod" because the errors.Is() check matched the wrong error.
Move the query into the locked code; this should prevent anyone from removing the pod while the container is part of it. Also fix the returned error: there is no reason to special-case one specific error, so wrap any error here so that callers at least know where it happened.
However, this is not good enough, because the batch does not update the state, which means it sees everything as it was before the container was locked. In that case the ctr and pod might already have been removed, so let the caller skip both the "ctr removed" and "pod removed" errors.
Fixes containers#23282
Signed-off-by: Paul Holzinger <[email protected]>
/hold cancel
@edsantiago please try this version, seems to work for me now (of course there are still other flakes)
If a pod is removed while podman pod stats is running, there is a race where the command might fail with "no such pod". This is not a user error; as in the ps/ls commands, skip the removed pod and move on to the next one.
Fixes containers#23327
Signed-off-by: Paul Holzinger <[email protected]>
Looks good - I'm now seeing the
The flake is refusing to manifest on my laptop now. LGTM!
/lgtm
The pod name was queried without holding the container lock, so the pod could be deleted in the meantime and podman failed with "no such pod" because the errors.Is() check matched the wrong error.
Move the query into the locked code; this should prevent anyone from removing the pod while the container is part of it. Also fix the returned error: there is no reason to special-case one specific error, so wrap any error here so that callers at least know where it happened.
Fixes #23282
Does this PR introduce a user-facing change?