consumeNewest understanding #1361
Replies: 2 comments
-
Clearly, what you described so far is a real bug; the second loop must be the real issue and can be removed.
-
OK, thank you for the answer! I'll work on it then and propose a PR.
-
Hi! I'm digging into the source code to understand some pain points we currently have, and there is a topic I'd like to discuss.
Assuming we use the Newest sort by default to ease topic analysis, with poll-timeout set to 15s by default, I found that the response time to consult topic data is always greater than 15s, even for topics with little data. If we leave the timeout at 1s it's OK, but I'm not sure whether this behaviour is expected.
I ran some tests locally with extra logs and found that this response time is caused by the do/while empty-poll loop in the consumeNewest function. Basically, the first loop iteration gets 15 records (5 records per partition, 3 partitions in the topic) by consuming the topic from the last offset of each partition minus 5 records. Normally we should stop here, because we have our 15 records to return, but we do an extra iteration to consume data from the real end offset. Because we are already at the end, that poll waits for the full 15s timeout, and only then is the result returned to the user.
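To make the timing concrete, here is a minimal Java simulation of the behaviour described above. It does not use the Kafka client and is not the actual consumeNewest code: the class and method names are hypothetical, and the loop shape (keep polling until a poll returns nothing, with an empty poll costing the full poll-timeout) is an assumption based on what the logs showed.

```java
// Hypothetical simulation; names and loop shape are assumptions, not the project's code.
class ConsumeNewestSimulation {

    /** Outcome of one simulated call. */
    static final class Result {
        final int records;    // records returned to the user
        final long waitedMs;  // simulated time spent blocked in empty polls
        final int polls;      // number of loop iterations

        Result(int records, long waitedMs, int polls) {
            this.records = records;
            this.waitedMs = waitedMs;
            this.polls = polls;
        }
    }

    static Result simulate(long[] endOffsets, int perPartition, long pollTimeoutMs) {
        // Start each partition at (end offset - perPartition), never below 0.
        long[] positions = new long[endOffsets.length];
        for (int p = 0; p < endOffsets.length; p++) {
            positions[p] = Math.max(0L, endOffsets[p] - perPartition);
        }

        int total = 0;
        long waitedMs = 0L;
        int polls = 0;
        int polled;
        do {
            polled = 0;
            for (int p = 0; p < endOffsets.length; p++) {
                // A poll returns everything between the current position and the end.
                polled += (int) (endOffsets[p] - positions[p]);
                positions[p] = endOffsets[p];
            }
            if (polled == 0) {
                // Nothing left to read: the poll blocks for the whole timeout.
                waitedMs += pollTimeoutMs;
            }
            total += polled;
            polls++;
        } while (polled > 0); // assumed exit condition: stop only after an empty poll

        return new Result(total, waitedMs, polls);
    }
}
```

With 3 partitions, 5 records per partition and a 15s timeout, the first iteration collects all 15 records and the second one only waits, which matches the response time I observed.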
I drew a picture to explain how it works with a topic of 3 partitions, and I'd like to know more about it and why consumeNewest is directly impacted by the poll-timeout variable. IMO we can either remove the do/while loop or add a condition to the while to stop once we are at the end offset, but I'd like some insights before opening a PR, to check whether this loop is needed elsewhere.
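As a rough idea of the second option, the stop condition could look like the sketch below. The class and method names are hypothetical, and partitions are keyed by plain integers to keep it self-contained. With the real Kafka client the two maps would presumably be filled from `KafkaConsumer.position(TopicPartition)` and `KafkaConsumer.endOffsets(Collection<TopicPartition>)`, but I have not checked how that fits into the existing code.

```java
import java.util.Map;

// Hypothetical helper, not part of the project's code.
class EndOffsetCheck {

    /**
     * Returns true when every partition's current position has reached its end
     * offset, i.e. another poll could only block until the timeout expires.
     */
    static boolean reachedEnd(Map<Integer, Long> positions, Map<Integer, Long> endOffsets) {
        for (Map.Entry<Integer, Long> entry : endOffsets.entrySet()) {
            Long position = positions.get(entry.getKey());
            // An unknown position is treated as "not at the end" to stay on the safe side.
            if (position == null || position < entry.getValue()) {
                return false;
            }
        }
        return true;
    }
}
```

The loop condition would then also check `!reachedEnd(positions, endOffsets)`, so the extra iteration is skipped once all partitions are fully read.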
Thanks!