Clarify expected behaviour for forced reboot with zero delay #101

Open
RonanMacF opened this issue Oct 12, 2022 · 4 comments

Comments

@RonanMacF

RonanMacF commented Oct 12, 2022

Hi,

When a forced reboot is issued what is the expected response from the server?

I appreciate that the server should ideally always issue a response, but when a forced reload occurs the box can shut down quickly, killing the connection and leading to a status.UNAVAILABLE error code. If you want to guarantee a response from the server, then the response may have to be returned before performing the reboot. The reboot may still fail afterwards for whatever reason, and there would be no way to propagate this back to the client with a unary RPC.

edit: this also applies to reloads with short delays. It can potentially apply to non-forced reloads too, but I presume that enough time will have passed by then to provide a response (although this may not be guaranteed).

@hellt
Contributor

hellt commented Nov 2, 2022

In my personal opinion, I'd prefer to send the RebootResponse before the gNOI server signals the chassis (active CPM) to reboot with a (zero) delay.

That way, a client can be assured that the gNOI server received the request and scheduled the reboot. So far, it is unclear what is expected from a gNOI server implementation, so it'd be good to have a round of discussion about it.
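
A minimal sketch of that ordering, in plain Python rather than a real gNOI server. `handle_reboot`, the `send_response` and `do_reboot` callbacks, and the empty `RebootResponse` dataclass are hypothetical stand-ins for illustration, not the gNOI API. The only point is that the reply is handed off before the reboot timer is started, even with a zero delay.

```python
import threading
from dataclasses import dataclass
from typing import Callable


@dataclass
class RebootResponse:
    """Hypothetical stand-in for the gNOI RebootResponse message."""


def handle_reboot(
    delay_s: float,
    send_response: Callable[[RebootResponse], None],
    do_reboot: Callable[[], None],
) -> threading.Timer:
    """Reply to the client first, then schedule the reboot."""
    # Step 1: acknowledge that the request was received and scheduled.
    send_response(RebootResponse())
    # Step 2: only now arm the reboot, so a zero delay cannot beat the reply.
    timer = threading.Timer(delay_s, do_reboot)
    timer.start()
    return timer
```

The trade-off raised in the thread still applies: once the response has gone out, a failure inside `do_reboot` can no longer be reported on the same unary call.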

/cc @robshakir

@RonanMacF
Author

I appreciate the value in a response. What I don't like is that there is no longer a guarantee that the reboot will succeed, as an issue could occur when actually applying the reboot. Furthermore, this may mean that the server needs to add some artificial delay, so that any reboot requested with a delay below a couple of seconds is really a reboot in, let's say, 5 seconds. If we don't give the server a chance to return the response before reloading, it will cause this same issue.
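
The artificial delay described here amounts to clamping the requested delay to a floor. A small sketch, assuming seconds as the unit for readability and using the "let's say 5 seconds" figure from above; `effective_delay` and `MIN_DELAY_S` are hypothetical names, not part of gNOI.

```python
# Assumption: the illustrative 5-second floor mentioned in the comment above.
MIN_DELAY_S = 5.0


def effective_delay(requested_s: float, floor_s: float = MIN_DELAY_S) -> float:
    """Return the delay the server would actually apply before rebooting."""
    if requested_s < 0:
        raise ValueError("delay must not be negative")
    # Anything shorter than the floor is stretched so the response can be sent.
    return max(requested_s, floor_s)
```

With this, a request for a zero-delay reboot silently becomes a 5-second reboot, which is exactly the behaviour change being questioned.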

@hellt
Contributor

hellt commented Nov 2, 2022

I think one can't guarantee a successful reboot for an active CPM (or the whole chassis) either, as in that case the gNOI server will shut down and the underlying TCP session will be torn down. I think in that case you don't even get a gRPC error code?

@RonanMacF
Author

The server going down would be that indication. This can be verified afterwards by running a RebootStatus command, i.e. if RebootStatusResponse.Count is the same as the pre-reboot RebootStatusResponse.Count + 1, then you know it happened.
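
A rough sketch of that check from the client side, assuming a hypothetical `get_count` callable that wraps a RebootStatus call and returns the `Count` field, and assuming connection failures while the box is down surface as `ConnectionError`. Neither is the real gNOI client API.

```python
import time
from typing import Callable


def wait_for_reboot(
    get_count: Callable[[], int],
    count_before: int,
    attempts: int = 10,
    interval_s: float = 1.0,
) -> bool:
    """Poll until the reboot count is exactly count_before + 1, or give up."""
    for _ in range(attempts):
        try:
            if get_count() == count_before + 1:
                return True
        except ConnectionError:
            # The device is presumably still down; keep polling.
            pass
        time.sleep(interval_s)
    return False
```

The caller records the count before issuing the reboot, then calls `wait_for_reboot` once the connection drops.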

If a user is requesting a forced reload now, then I'm not sure graceful handling of it would be expected. You will get a gRPC error when the server shuts down: an UNAVAILABLE status.
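
One way a client could act on this, sketched with a hypothetical `Code` enum that mirrors a few standard gRPC status code numbers and a hypothetical `classify_reboot_result` helper. The outcome labels are illustrative only.

```python
from enum import Enum


class Code(Enum):
    """Hypothetical subset of gRPC status codes, using their standard numbers."""
    OK = 0
    PERMISSION_DENIED = 7
    UNAVAILABLE = 14


def classify_reboot_result(forced: bool, code: Code) -> str:
    """Map the RPC outcome of a Reboot call to a client-side interpretation."""
    if code is Code.OK:
        return "accepted"
    if forced and code is Code.UNAVAILABLE:
        # The connection was probably killed by the reboot itself.
        return "likely-rebooting"
    return "failed"
```

A "likely-rebooting" result is not proof that the reboot happened, so it would still need to be confirmed once the device is reachable again.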
