Cost and Efficiency Concerns with Message Deletion and Re-sending in AWS SQS and Support for FIFO Queues #162
JhonCampos
started this conversation in
General
Replies: 1 comment
-
Hi! Yes, your analysis is correct. I would sum it up like this:
However:
-
Hello everyone,
I would like to raise a concern regarding the current mechanism in the bref-laravel-bridge library that always deletes the message from the SQS queue, even when the processing fails, and then re-sends a new message to the queue. This approach seems to have been implemented for compatibility with Laravel’s queue system, but it introduces additional costs and inefficiencies, particularly when used with AWS SQS.
Current Behavior:
When a message is processed (successfully or not): The message is always deleted from the queue using DeleteMessage.
If processing fails: A new message is sent to the queue, creating a fresh SendMessage request.
Issues with this Approach:
Increased Costs: Every processed message incurs a DeleteMessage request, whether or not processing succeeded. When a message fails and a new one is sent, there are additional SendMessage and ReceiveMessage requests on top of that. This doubles the number of requests for every failed message, leading to higher costs.
For each failed message, instead of relying on the built-in visibility timeout feature of SQS, which would make the message available again automatically without a delete and re-send, we are creating 4 requests per retry: SendMessage, ReceiveMessage, DeleteMessage, and then another SendMessage after failure.
Using the built-in retry mechanism in SQS (without deleting the message until it’s processed successfully) would cut the number of requests down to 2 per retry, significantly reducing costs.
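To make the comparison concrete, here is a rough request-count model in Python. It is only a sketch under my own assumptions: one ReceiveMessage per delivery, no batching, no empty polls, and a job that fails a given number of times before finally succeeding. The function names are made up for illustration, and the totals cover the whole life of the job, so they are counted a little differently from the per-retry figures above.

```python
def requests_delete_and_resend(failures: int) -> int:
    """Requests when every attempt is deleted and each failure is re-sent."""
    attempts = failures + 1      # failed attempts plus the final success
    sends = 1 + failures         # original SendMessage plus one re-send per failure
    receives = attempts          # one ReceiveMessage per attempt
    deletes = attempts           # DeleteMessage after every attempt
    return sends + receives + deletes


def requests_visibility_timeout(failures: int) -> int:
    """Requests when failed messages simply reappear after the visibility timeout."""
    attempts = failures + 1
    sends = 1                    # the message is sent only once
    receives = attempts          # one ReceiveMessage per attempt
    deletes = 1                  # DeleteMessage only after the successful attempt
    return sends + receives + deletes


if __name__ == "__main__":
    for failures in range(4):
        print(
            failures,
            requests_delete_and_resend(failures),
            requests_visibility_timeout(failures),
        )
```

Under these assumptions each failed attempt adds three requests with delete-and-re-send and one request with the visibility timeout, so the gap widens with every retry.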
Loss of Visibility Timeout Utility: By deleting the message and immediately re-sending a new one, the system bypasses the benefit of visibility timeout. The visibility timeout is meant to provide a buffer time during which the message is invisible to other consumers, allowing for retries after a delay.
In this case, since the Lambda (acting as the consumer) processes and re-sends the message instantly, it eliminates the ability to control retry delays based on visibility timeout, which could be important in scenarios where retries need to be spaced out (e.g., waiting for external dependencies to become available before retrying).
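As an illustration of spacing out retries with the visibility timeout, the sketch below extends the timeout of a failed message based on how many times it has been received. It assumes a boto3-style SQS client is passed in (only its `change_message_visibility` call is used) and that the receive count comes from the message's `ApproximateReceiveCount` attribute. The function names and the 30-second base delay are hypothetical choices of mine, and this costs one extra ChangeMessageVisibility request per failure.

```python
MAX_VISIBILITY_TIMEOUT = 43_200  # SQS caps the visibility timeout at 12 hours


def backoff_seconds(receive_count: int, base: int = 30) -> int:
    """Exponential delay: base, 2*base, 4*base, ... capped at the SQS maximum."""
    exponent = max(receive_count, 1) - 1
    return min(base * 2 ** exponent, MAX_VISIBILITY_TIMEOUT)


def delay_retry(sqs_client, queue_url: str, receipt_handle: str, receive_count: int) -> int:
    """Keep a failed message hidden for a growing delay instead of re-sending it."""
    timeout = backoff_seconds(receive_count)
    sqs_client.change_message_visibility(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle,
        VisibilityTimeout=timeout,
    )
    return timeout
```

The message is never deleted here, so SQS delivers it again once the extended timeout expires.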
Proposal:
Switch to the Built-in SQS Retry Mechanism: Modify the library to only delete the message after successful processing, allowing SQS to handle automatic retries based on the visibility timeout when processing fails. This will reduce the number of requests (and therefore costs) and make better use of SQS’s retry capabilities.
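One way this could look with Lambda's SQS integration is the partial batch response. The sketch below is in Python for brevity and is not how bref-laravel-bridge is implemented. It assumes the event source mapping has ReportBatchItemFailures enabled, and `process` is a hypothetical callable standing in for the job handler.

```python
def handle_batch(event: dict, process) -> dict:
    """Process each SQS record and report only the failed ones back to Lambda."""
    failures = []
    for record in event.get("Records", []):
        try:
            process(record["body"])
        except Exception:
            # Listing the message ID tells Lambda not to delete this message,
            # so it becomes visible again after the visibility timeout.
            failures.append({"itemIdentifier": record["messageId"]})
    # Messages not listed here are treated as successful and deleted by Lambda.
    return {"batchItemFailures": failures}
```

With this shape the handler itself never calls DeleteMessage or SendMessage for a retry. Failed messages stay in the queue, and a redrive policy can move them to a dead-letter queue after too many receives.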
Support for FIFO Queues: In addition, I would suggest exploring support for FIFO queues. FIFO queues offer the benefit of ensuring that messages are processed exactly once and in order within each message group, which could be useful for certain workloads where order and deduplication are important.
In FIFO queues, you have more precise control over message processing, and the combination of FIFO queues with the retry mechanism could bring additional benefits such as better sequencing and deduplication (thanks to message group IDs and deduplication IDs).
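For reference, sending to a FIFO queue means supplying a message group ID and, unless content-based deduplication is enabled on the queue, a deduplication ID. The sketch below builds the SendMessage parameters for a boto3-style client. The function name is hypothetical, and hashing the body with SHA-256 is my assumption for producing the deduplication ID (it mirrors what content-based deduplication does).

```python
import hashlib


def fifo_send_params(queue_url: str, body: str, group_id: str) -> dict:
    """Build SendMessage keyword arguments for a FIFO queue."""
    if not queue_url.endswith(".fifo"):
        raise ValueError("FIFO queue names must end with .fifo")
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        # Messages sharing a group ID are delivered in order, one at a time.
        "MessageGroupId": group_id,
        # Identical IDs within the deduplication interval are treated as duplicates.
        "MessageDeduplicationId": hashlib.sha256(body.encode("utf-8")).hexdigest(),
    }
```

The result would be passed as `sqs_client.send_message(**params)`. Note that if a failed job were re-sent with an identical body inside the five-minute deduplication interval, SQS would accept the call but not deliver the copy, which is another reason to leave failed messages in the queue rather than re-send them.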
Final Thoughts:
The current method of deleting and re-sending messages upon failure not only increases costs but also limits the use of SQS’s built-in features, like visibility timeout and automatic retry handling. Adding support for FIFO queues could also open up new use cases where ordering and deduplication are critical. I believe addressing these points would improve both the efficiency and the flexibility of the bref-laravel-bridge library.
I would appreciate any thoughts or feedback on this issue and look forward to discussing potential improvements.