Accuracy of inactive threshold for ephemeral consumers #778
-
@erdemiru I will look at this in detail, but off the top of my head:
2000 + 2000 + 2000 = 6000 ms, which is 6 seconds. By then the subscription is already inactive because it did not get another pull. I'm pretty sure reading and acking do not reset the threshold (I'll verify). I'm surprised you got the second set of 3. Either way, it's not exact. The server is doing lots of work, so the subscription may stay active longer than the threshold, but never less.
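To make the arithmetic concrete, here is a minimal sketch in plain Java (no NATS calls). It assumes, as the sum suggests, three messages at 2000 ms each between pulls, and the 5-second default threshold cited in the original report. The class and method names are made up for illustration.

```java
import java.time.Duration;

// Hypothetical helper, not part of jnats: pure arithmetic on the numbers above.
final class PullGapCheck {
    // Time spent processing one batch before the next pull can be issued.
    static Duration gapBetweenPulls(int messagesPerBatch, Duration processingPerMessage) {
        return processingPerMessage.multipliedBy(messagesPerBatch);
    }

    // True when the gap is longer than the threshold, i.e. the server may
    // already consider the consumer inactive when the next pull arrives.
    static boolean mayGoInactive(Duration gap, Duration inactiveThreshold) {
        return gap.compareTo(inactiveThreshold) > 0;
    }

    public static void main(String[] args) {
        Duration gap = gapBetweenPulls(3, Duration.ofMillis(2000));
        Duration threshold = Duration.ofSeconds(5); // default cited in the report
        System.out.println("gap=" + gap.toMillis() + " ms, may go inactive: "
                + mayGoInactive(gap, threshold));
    }
}
```

Since the threshold is not exact on the server side, a `false` result here is only a rough indication, not a guarantee.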
-
We are working on improvements for handling inactive consumers, but it IS VERY difficult to know whether a consumer is still active. You can ask the server for consumer info, but that's a round trip to the server. There are now heartbeats on pulls, so that's one way we will try to address inactivity.
-
Hi @scottf, thanks for your quick response. It is interesting to hear that acknowledging a message does not reset the threshold, as I would expect an ack to be a clear indication that the consumer is active. I also tried some other examples:
In both cases, the interval between pulls is 10 seconds. As a workaround, we can set the inactive threshold to something longer than the maximum processing time × batch size.
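The workaround can be sketched as a small calculation. This is only an illustration in plain Java: the 3-second processing time comes from the original report, while the batch size of 10, the 5-second margin, and the class and method names are assumptions made up for the example.

```java
import java.time.Duration;

// Hypothetical helper, not part of jnats.
final class InactiveThresholdSizing {
    // Lower bound from the workaround: max processing time x batch size.
    static Duration lowerBound(Duration maxProcessingPerMessage, int batchSize) {
        return maxProcessingPerMessage.multipliedBy(batchSize);
    }

    // Adds headroom on top of the bound, since the threshold should be
    // "something longer" than the bound; the margin itself is a guess.
    static Duration withMargin(Duration bound, Duration margin) {
        return bound.plus(margin);
    }

    public static void main(String[] args) {
        Duration bound = lowerBound(Duration.ofSeconds(3), 10); // assumed batch size
        Duration threshold = withMargin(bound, Duration.ofSeconds(5)); // assumed margin
        System.out.println("lower bound: " + bound.getSeconds() + " s, suggested: "
                + threshold.getSeconds() + " s");
    }
}
```

The resulting duration would then go into the consumer configuration in place of the default.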
-
I don't know if this is directly related to this issue, but I also see duplicate messages in some scenarios. Let's say we set the inactive threshold to 10 minutes to rule out the inactive threshold issue, with number of messages: 57, batch size: 100 (greater than the number of available messages), and processing delay: 1000. In that case, the consumer receives duplicate messages,
Calling msg.ack() or msg.ackSync() before the sleep doesn't fix it. It seems related to ack_wait (which is 30 seconds by default). Setting it to a larger value avoids the duplicates. Are the acknowledgements somehow delayed for batch pull requests?
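One possible explanation can be checked with a rough timing model. The sketch below is plain Java and rests on several assumptions that are not confirmed in this thread: the whole batch is delivered at roughly the same moment, the ack wait timer starts at delivery, messages are processed one after another, the delay is 1000 ms, and network latency is ignored. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical timing model, not part of jnats.
final class AckWaitModel {
    // Returns the 1-based positions in the batch whose ack would be sent
    // after ackWaitMillis has elapsed since delivery (assumed to be time 0).
    static List<Integer> lateAcks(int messages, long processingMillis,
                                  long ackWaitMillis, boolean ackBeforeProcessing) {
        List<Integer> late = new ArrayList<>();
        for (int i = 1; i <= messages; i++) {
            // Acking before the sleep only saves one processing delay per message.
            long ackAt = ackBeforeProcessing
                    ? (i - 1) * processingMillis
                    : i * processingMillis;
            if (ackAt > ackWaitMillis) {
                late.add(i);
            }
        }
        return late;
    }

    public static void main(String[] args) {
        System.out.println("ack after processing:  " + lateAcks(57, 1000, 30_000, false).size());
        System.out.println("ack before processing: " + lateAcks(57, 1000, 30_000, true).size());
    }
}
```

Under these assumptions, messages from position 31 onward (27 of the 57) would be acked too late when acking after the sleep, and from position 32 onward (26) when acking before it. If the model holds, that would explain why moving the ack before the sleep does not help while a larger ack_wait does.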
-
Is it taking you 30 seconds to ack something? Maybe you need to reduce your batch size and/or increase your ack wait. Ack wait and redelivery are fundamental server features; I'd be surprised if they were broken, but I suppose they could be. I'm moving this issue to a discussion.
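As a rough guide for picking a batch size, the sketch below computes the largest batch whose last message would still be acked within the ack wait. It is plain Java with made-up names, and it assumes sequential processing, an ack after each message is processed, and a timer that starts when the batch is delivered.

```java
// Hypothetical helper, not part of jnats.
final class BatchSizing {
    // Largest batch whose last ack lands at or before ackWaitMillis,
    // given a fixed per-message processing time.
    static int maxBatchSize(long ackWaitMillis, long processingMillis) {
        if (processingMillis <= 0) {
            throw new IllegalArgumentException("processingMillis must be positive");
        }
        return (int) (ackWaitMillis / processingMillis);
    }

    public static void main(String[] args) {
        // 30 s default ack wait from the thread, 1000 ms assumed processing delay.
        System.out.println("max batch size: " + maxBatchSize(30_000, 1000));
    }
}
```

This leaves no headroom for latency or slow messages, so a smaller batch or a longer ack wait is probably the safer choice in practice.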
-
Defect
Versions of io.nats:jnats and nats-server:
nats-server: 2.9.3
io.nats:jnats: 2.16.1
OS/Container environment:
Steps or code to reproduce the issue:
The NATS documentation indicates that the default value of the inactive threshold is 5 seconds. When processing messages with an added processing delay (e.g. 3 s), nextMessage() always returns null after several pull requests, even though the subscription is active and there are still more messages on the server.
The example project contains two test methods with different test parameters.
shouldConsumeAllMessagesWithBatchPull method
shouldConsumeAllMessages is a slightly more complex test method
Some additional information:
Expected result:
Actual result: