Do not use beta API for hyperdisk in multi-writer mode. #1864
Welcome @karkunpavan! |
/ok-to-test |
Detailed code changes as requested. Updated the PR with all of these code changes.
/retest-required The failures are SSH timeouts which seem to be unrelated to my changes. |
/retest - seems like a flaky test which timed out. |
/retest |
Update change log
…lows for more accurate error code reporting if gRPC functionality is refactored
…ging Refactor metric defer() statements to gRPC metric interceptor
Don't overwrite libc in distroless debian base image
update prow rc with 1.15.3-rc1 release candidate
Require VACs to use SI units
The volume attribute class tests are valid only for clusters running 1.31+. By making the param configurable in run-k8s-integration-*.sh, we can disable the flag on k8s clusters below the intended minor version.
Make the volume attribute class file a configurable input in the tester script
Skip xfs test for GCE test skip
Make a new release candidate with 1.15.3
This fixes a regression introduced in kubernetes-sigs#1876 where the driver would start panicking on startup if `--http-endpoint` was specified. This was caused by the metrics not being initialized anymore during startup. The proposed fix involves using the `Reset` methods of the metrics object instead of trying to redefine them each time they need to be reset.
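The reset-instead-of-redefine approach described in the commit above can be illustrated with a minimal sketch. Everything here is hypothetical: `counterVec` is a toy stand-in for a labelled metric, and `metricsManager`, `newMetricsManager`, and `resetMetrics` are illustrative names, not the driver's actual types. The point is only that the metric is created once at construction and later cleared in place, so it is never nil at startup.

```go
package main

import "fmt"

// counterVec is a toy stand-in for a labelled counter metric (hypothetical type).
type counterVec struct {
	values map[string]float64
}

func newCounterVec() *counterVec {
	return &counterVec{values: map[string]float64{}}
}

// Inc increments the counter for the given label.
func (c *counterVec) Inc(label string) {
	c.values[label]++
}

// Reset clears all recorded values but keeps the same metric object.
func (c *counterVec) Reset() {
	for k := range c.values {
		delete(c.values, k)
	}
}

// metricsManager owns its metrics for its whole lifetime (hypothetical type).
type metricsManager struct {
	ops *counterVec
}

// newMetricsManager initializes the metric once, so it is usable from startup.
func newMetricsManager() *metricsManager {
	return &metricsManager{ops: newCounterVec()}
}

// resetMetrics clears the metric in place instead of redefining it.
func (m *metricsManager) resetMetrics() {
	m.ops.Reset()
}

func main() {
	m := newMetricsManager()
	m.ops.Inc("CreateVolume")
	m.resetMetrics()
	m.ops.Inc("DeleteVolume") // still safe: the metric object was never replaced
	fmt.Println(len(m.ops.values), "label(s) recorded after reset")
}
```

Because the pointer held by the manager never changes, any code that captured the metric before a reset keeps working afterwards.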
This is not a replacement for standard-rwx. It enables multiwriter on a block device, but this does not mean you have a multiwriter filesystem. A distributed filesystem is hard to make, and multiwriter devices are only one piece of the puzzle. ext4 and xfs are not distributed. |
@mattcary I understand, my case specifically is that we have Performance pods (= one pod per node) running ML models on GKE, and we'd like to enable rolling updates for it while keeping the models persistent to reduce overhead. Right now, this rules us out from either doing rolling updates or using hyperdisk, due to a "true" distributed FS requirement, which has very high overhead. Making it read-only would prohibit us from using some Python libraries we use like stanza. Our understanding is that this is the k8s equivalent to using a cannon to catch a fish and in practice a much simpler solution would suffice, since writes are extremely rare and no more than one pod would be writing at any given time (we can also guarantee this using external locks if needed). Am I understanding this incorrectly? |
@gtomitsuka The kernel filesystem modules try very hard to do things like cache in memory. So even if writes are rare, the in-memory structures on different machines are going to be out of date and desynchronized. It seems possible to have some ROX with a single writer, which unmounts all readers, updates the single writer, and then remounts, while keeping all the disks attached. But that will require a new csi driver anyway. |
Use correct path in error message for udev tooling
…am/fix-panic [metrics] Fix panic during metrics manager startup
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: karkunpavan The full list of commands accepted by this bot can be found here.
Closing this pull request, will raise a new one with planned changes |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Hyperdisk multi-writer support is now in compute/v1, but the persistent disk CSI driver still calls v0.beta, which leads to an error while creating a hyperdisk in multi-writer mode. Error:
To fix this we will call the v1 API, which accepts `accessMode = ReadWriteMany`.
Which issue(s) this PR fixes:
Fixes #1863
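The idea behind the fix can be sketched as a small version-selection helper. This is an illustration under stated assumptions, not the driver's code: `apiVersion`, `apiV1`, `apiBeta`, and `chooseAPIVersion` are hypothetical names, and keeping non-hyperdisk multi-writer requests on the beta surface is assumed from the description, which leaves other disk types out of scope.

```go
package main

import (
	"fmt"
	"strings"
)

// apiVersion names the Compute API surface to call (hypothetical type).
type apiVersion string

const (
	apiV1   apiVersion = "v1"
	apiBeta apiVersion = "beta"
)

// chooseAPIVersion picks the API surface for a disk-create call.
// Assumption: hyperdisk types request multi-writer through v1, while
// other disk types keep using beta for multi-writer, as before the fix.
func chooseAPIVersion(diskType string, multiWriter bool) apiVersion {
	if !multiWriter {
		return apiV1
	}
	if strings.HasPrefix(diskType, "hyperdisk") {
		return apiV1
	}
	return apiBeta
}

func main() {
	cases := []struct {
		diskType    string
		multiWriter bool
	}{
		{"hyperdisk-balanced", true},
		{"pd-ssd", true},
		{"pd-balanced", false},
	}
	for _, c := range cases {
		fmt.Printf("%s multiWriter=%t -> %s\n",
			c.diskType, c.multiWriter, chooseAPIVersion(c.diskType, c.multiWriter))
	}
}
```

Keeping the decision in one function means the hyperdisk path can move to v1 without touching how other disk types are created.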
Special notes for your reviewer:
`hyperdisk*`, multi-writer support for other disk types is not the focus of this fix.
Does this PR introduce a user-facing change?: