Describe the bug
Firstly, I am aware that mirrored classic queues are about to be removed. I am opening this issue mostly to document the observed behaviour.
Given a mirrored classic queue with max-length[-bytes] and overflow: reject-publish: when the queue is full and a client with a long-lived connection and publisher confirms disabled publishes messages to this queue, the memory of the slave processes grows continuously, eventually leading to an OOM. (The memory is released when the connection is closed.)
What happens is that the channel process sends published messages to both the master and the slave processes of the queue, and the slaves temporarily store them in the sender_queues field of their state (maybe_enqueue_message). When the queue is full and publisher confirms are enabled, the master also broadcasts a discard message to the slaves (in send_reject_publish), which removes the message from sender_queues (in publish_or_discard). However, if publisher confirms are disabled, the master sends nothing to the slaves (in send_reject_publish), so the sender_queues structure grows indefinitely.
We speculate that the issue also exists if the messages are published to the mirrored queue via dead-lettering.
Reproduction steps
Create a multi-node cluster, for example on 3.12.6 (I tested on main 09a95a5)
Create a policy for all classic queues with ha-mode: all
Create a classic queue with max-length: 10 and overflow: reject-publish, with the leader node being rabbit-1
Open an AMQP connection and publish messages to the queue continuously (without enabling publisher confirms); see the publisher sketch below. The queue will have 10 messages. The memory on rabbit-1 and the memory of the queue master process remain stable. However, the memory on rabbit-2 and rabbit-3, and the process memory of the queue slave processes, will grow continually.
...
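For reference, the publish loop from the last step can be sketched with a Python client. This is a minimal sketch, assuming pika, placeholder host/queue names, and that the ha-mode: all policy above is already applied:

```python
# Minimal reproduction sketch (assumption: pika client, placeholder names).
# The queue arguments mirror max-length: 10 and overflow: reject-publish;
# setting them via a policy instead should behave the same.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbit-1"))
channel = connection.channel()

channel.queue_declare(
    queue="test-queue",
    durable=True,
    arguments={"x-max-length": 10, "x-overflow": "reject-publish"},
)

# Publisher confirms are deliberately NOT enabled (no confirm_delivery() call),
# which is the condition under which the slaves' sender_queues keep growing.
body = b"x" * 1024
while True:
    channel.basic_publish(exchange="", routing_key="test-queue", body=body)
```

While the loop runs, the memory of the queue mirror processes on rabbit-2 and rabbit-3 can be watched, for example on the queue page of the management UI.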
Expected behavior
The memory of the queue slave processes remains stable.
Additional context
Because variable_queue:discard is a no-op, this issue of a missing discard call probably does not affect any of the queue types included in RabbitMQ, apart from mirrored classic queues.
However, it might affect queue types provided by community plugins that are based on the (non-mirrored) classic queue. As the example of the message deduplication plugin shows (noxdafox/rabbitmq-message-deduplication#96), there are plugins that make use of the discard callback. Hence I think it is worth considering including a fix (which I'm willing to submit).
reject-publish without publisher confirms does not make much sense. A workaround could be a way to forcefully enable publisher confirms for a virtual host, or even all AMQP 0-9-1 channels on a node.
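For reference, from the client side confirms can already be enabled per channel, in which case the master broadcasts the discard to the slaves and the growth does not occur. A minimal pika sketch (placeholder host/queue names):

```python
# Sketch of the same publisher but with publisher confirms enabled
# (assumption: pika client, placeholder names).
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbit-1"))
channel = connection.channel()
channel.confirm_delivery()  # enable publisher confirms on this channel

try:
    channel.basic_publish(exchange="", routing_key="test-queue", body=b"x")
except pika.exceptions.NackError:
    # With overflow: reject-publish and the queue full, the broker nacks
    # the publish, which pika surfaces as NackError.
    pass
```

This relies on each publishing application opting in, hence the suggestion of forcing confirms at the virtual-host or node level.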
luos pushed a commit to esl/rabbitmq-server that referenced this issue on Oct 30, 2024