I have a simple pipeline that pulls messages from Kafka in batches and persists them to S3. The input batch size is 50 messages and the output batch size is 20. I expect the output batch size to be 20, but it is reported as 50. When I remove the input batching policy, the output batch size is 20 as expected.
I hope I have understood the concept of batching in Redpanda Connect correctly. My understanding is:
Say the input has 100 records. Redpanda Connect cuts them evenly into 2 batches of 50.
Each batch goes through the processors, after which we still have 2 batches with 50 messages in each.
When a batch of 50 messages arrives at the output, it is cut further into smaller batches of 20 messages.
The processors in the batching policy handle 20 messages at a time, so batch_size() should return 20 instead of 50.
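To make the expectation concrete, the steps above can be sketched in plain Python. This only models the mental picture described in this post, not what Redpanda Connect actually does internally. The helper `split_into_batches` is hypothetical, and the sketch assumes a batch is cut into fixed-size chunks with any remainder left as a smaller final chunk (the post does not say how a remainder would be handled).

```python
from typing import List


def split_into_batches(messages: List[str], size: int) -> List[List[str]]:
    """Hypothetical helper: cut messages into consecutive chunks of at most `size`."""
    return [messages[i:i + size] for i in range(0, len(messages), size)]


# 100 records arrive at the input.
records = [f"msg-{n}" for n in range(100)]

# Step 1: the input batching policy (count 50) yields 2 batches of 50.
input_batches = split_into_batches(records, 50)

# Step 2: the expectation is that each batch of 50 is cut again at the
# output (count 20). Assumption: a remainder becomes its own smaller chunk.
output_batches = [
    chunk
    for batch in input_batches
    for chunk in split_into_batches(batch, 20)
]

input_sizes = [len(batch) for batch in input_batches]
output_sizes = [len(batch) for batch in output_batches]

print(input_sizes)   # [50, 50]
print(output_sizes)  # [20, 20, 10, 20, 20, 10]
```

Under this model no output batch is ever larger than 20, which is why a reported size of 50 is surprising.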
In short, batch_size() should return 20 instead of 50. Many thanks!
My connect.yaml is as follows: