This repository has been archived by the owner on May 7, 2021. It is now read-only.
@yifan-gu had a test he wanted to write for pluton where:
1. The worker goes down.
2. The pod and checkpointer are deleted on the master.
3. The worker comes back, still running the pod and checkpointer.
4. The master comes back; the worker now sees that no pod or checkpointer is scheduled, so the checkpointer cleans up everything.
In this case it'd be nice if we could issue a machine.Stop() and a machine.Start() command instead of a reboot. Not sure if this is a feature we should bake into the platform package or not. I think we can work around this for now by issuing machine.SSH("sudo systemctl mask kubelet.service") and machine.Reboot() for our stop command, then running machine.SSH("sudo systemctl enable --now kubelet.service") for our start command.
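A minimal sketch of that workaround, in case it helps the discussion. The `Machine` interface, the helper names `stopKubelet` and `startKubelet`, and the `fakeMachine` recorder are all hypothetical and are not the platform package's actual API; only the `SSH` and `Reboot` method names come from the comment above. The sketch also unmasks the unit before enabling it, which is an addition on my part: systemd refuses to enable or start a masked unit.

```go
package main

import "fmt"

// Machine is a hypothetical stand-in for the platform package's machine
// type. Only the two methods named in the discussion are assumed, and the
// signatures here are guesses made for illustration.
type Machine interface {
	SSH(cmd string) ([]byte, error)
	Reboot() error
}

// stopKubelet emulates a machine.Stop(): mask the kubelet so it cannot
// come back after the reboot, then reboot the machine.
func stopKubelet(m Machine) error {
	if _, err := m.SSH("sudo systemctl mask kubelet.service"); err != nil {
		return fmt.Errorf("masking kubelet: %w", err)
	}
	if err := m.Reboot(); err != nil {
		return fmt.Errorf("rebooting: %w", err)
	}
	return nil
}

// startKubelet emulates a machine.Start(): unmask the unit (assumption,
// needed because a masked unit cannot be enabled), then enable and start it.
func startKubelet(m Machine) error {
	cmds := []string{
		"sudo systemctl unmask kubelet.service",
		"sudo systemctl enable --now kubelet.service",
	}
	for _, c := range cmds {
		if _, err := m.SSH(c); err != nil {
			return fmt.Errorf("running %q: %w", c, err)
		}
	}
	return nil
}

// fakeMachine records calls so the sketch runs without a real cluster.
type fakeMachine struct{ log []string }

func (f *fakeMachine) SSH(cmd string) ([]byte, error) {
	f.log = append(f.log, "ssh: "+cmd)
	return nil, nil
}

func (f *fakeMachine) Reboot() error {
	f.log = append(f.log, "reboot")
	return nil
}

func main() {
	m := &fakeMachine{}
	if err := stopKubelet(m); err != nil {
		panic(err)
	}
	// A real test would delete the pod and checkpointer here.
	if err := startKubelet(m); err != nil {
		panic(err)
	}
	for _, entry := range m.log {
		fmt.Println(entry)
	}
}
```

Keeping the two helpers outside the platform package would leave the question of a real Stop()/Start() open while still letting the test be written today.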
Another data point here: I think upstream will use iptables to blackhole certain nodes for destructive tests. That seems like a separate feature that could be implemented in a fully platform-independent way. I believe Reboot() is currently a platform-specific implementation.
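To make the iptables idea concrete, here is a rough sketch of how the blackhole commands could be generated and then run over SSH on any platform. The function name `blackholeCommands` and the choice to drop traffic in both directions are assumptions for illustration; I have not checked how upstream actually does this. The iptables flags used (`-I`, `-D`, `-s`, `-d`, `-j DROP`) are standard.

```go
package main

import (
	"fmt"
	"net"
)

// blackholeCommands returns the shell commands that drop all IPv4 traffic
// to and from peerIP on the machine where they are run. With remove set,
// it returns the commands that delete those same rules again.
// The name and shape of this helper are hypothetical.
func blackholeCommands(peerIP string, remove bool) ([]string, error) {
	ip := net.ParseIP(peerIP)
	if ip == nil || ip.To4() == nil {
		return nil, fmt.Errorf("not a valid IPv4 address: %q", peerIP)
	}
	op := "-I" // insert the rule at the top of the chain
	if remove {
		op = "-D" // delete the matching rule
	}
	return []string{
		fmt.Sprintf("sudo iptables %s INPUT -s %s -j DROP", op, peerIP),
		fmt.Sprintf("sudo iptables %s OUTPUT -d %s -j DROP", op, peerIP),
	}, nil
}

func main() {
	cmds, err := blackholeCommands("10.0.0.5", false)
	if err != nil {
		panic(err)
	}
	// In a test these would be passed to machine.SSH one at a time.
	for _, c := range cmds {
		fmt.Println(c)
	}
}
```

Since this only needs SSH access, it would not depend on any platform-specific reboot or stop support.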
I've also been wanting a machine.Start() to provide a more flexible machine setup by being able to call methods to configure networking and so on before booting the machine for the first time.
I just want to note that although start/stop will be easy to implement on GCE and AWS, the QEMU code will need some reworking in order to do that. So for now, please stick with finding alternative ways to implement such tests, as I shouldn't get sucked into redoing that code just yet.
cc @marineam