Skip to content
This repository has been archived by the owner on May 21, 2024. It is now read-only.

prematurely "context canceled" on gRPC stream when timeout is set #29

Closed
thiagodpf opened this issue Jul 19, 2023 · 3 comments · Fixed by #47
Closed

prematurely "context canceled" on gRPC stream when timeout is set #29

thiagodpf opened this issue Jul 19, 2023 · 3 comments · Fixed by #47
Assignees
Labels
bug Something isn't working

Comments

@thiagodpf
Copy link
Contributor

Brief summary

Hello again folks!

I'm using the experimental module k6/experimental/grpc to perform load tests on a service that communicates using bidirectional gRPC stream.

When creating the stream from the gRPC client (based on this example), I am passing the timeout parameter and this is causing the problem of the "context being canceled prematurely" (in golang) when I close the sending stream (client side).

k6 version

v0.45.0

OS

Ubuntu 20.04.6 LTS

Docker version and image (if applicable)

No response

Steps to reproduce the problem

See the script snippet below, notice that after sending all items to the server, I am closing the (client) sending stream (but the receiving stream must remain open until the server closes it OR the timeout is reached):

  // notice the timeout at the end of the line ------------------------------vvvv
  const stream = new Stream(grpcClient, 'grp.Service/MyEndpoint', { timeout: '2m' });

  // register the response listener...
  stream.on('data', (response) => {
    doSomeChecks(response);
  });

  // send some items to the server...
  for (let i = 0; i < 5; i++) {
    stream.write(item);
  }

  // close the client stream
  stream.end();

What happens in practice is that after closing the stream (.end()) on the client side, due to the configured timeout (in the script), the code that runs behind the scenes (in Golang) is canceling the context used by the whole stream (before the timeout has been reached!!!). However, when doing so, the receiving stream is also canceled and the server does not have time to send its responses.

When the sending stream is closed (in the script 👆), the Golang code behind the scenes goes through here 👇:
https://github.com/grafana/k6/blob/703970eb58d756451c073d0797dce36e067bc386/vendor/github.com/grafana/xk6-grpc/grpc/stream.go#L222-L233
note that on line 228 👆 a function is queued and it will call the function s.closeWithError(err) (line 229)...

...and looking inside the closeWithError function 👇, we can see that if there is a timeout defined (line 336), it will be canceled prematurely:
https://github.com/grafana/k6/blob/703970eb58d756451c073d0797dce36e067bc386/vendor/github.com/grafana/xk6-grpc/grpc/stream.go#L325-L338

Unfortunately I don't know how to suggest a fix, as I lack knowledge of the logic you guys used behind this bidirectional control mechanism. But I think it's worth registering this issue.

The workaround I currently have is to not set the timeout on stream creation (in the script). Unfortunately, if the server takes too long to complete the response (e.g. in a stress test measuring throughput), this results in a VU stuck in one iteration until it reaches the scenario's gracefulStop. 😞

Expected behaviour

It is expected that when closing the stream on the client side, if there is a timeout defined and no errors are passed to the closeWithError(err) function, then the context should not be canceled prematurely.
At least that's how I see the solution to the problem.

Actual behaviour

The context is being cancelled, regardless of whether there is an error or not (as a function parameter: closeWithError(err))

@thiagodpf thiagodpf added the bug Something isn't working label Jul 19, 2023
@olegbespalov olegbespalov transferred this issue from grafana/k6 Jul 20, 2023
@olegbespalov
Copy link
Collaborator

Hi @thiagodpf . Yes, it definitely deserves a better investigation. Moving this into the xk6-grpc where the fix/workaround should be proposed. Thanks for the report! 🙇

@thiagodpf
Copy link
Contributor Author

Hi @olegbespalov !

I think "stream context cancellation" would make more sense in cases where an error is actually passed to the function, like this:

image

But more investigation is needed on the impact of not canceling the stream's context when not passing an error to the function (i.e. err == nil)
I imagine that this code needs to be adjusted to consider that the stream has two ways... that is, even if it is closed at the client end, the server does not close its end and can still send data, unless the context is canceled by the client (due to an error) or reach the timeout. 🤔🤔🤔

I think this question will still remain open for some time because it is an experimental module, right? 😅

Here on my side, it's okay to continue without the timeout. I just thought it was interesting to raise this point of attention about this setting. 🔧

@thiagodpf
Copy link
Contributor Author

thiagodpf commented Jul 21, 2023

I opened this PR 👆 as a suggestion that seems coherent to me.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants