Check if a rabbitmq is not running #485

Open
jeroenmaelbrancke opened this issue Aug 22, 2018 · 0 comments

The RabbitMQ tests in the healthcheck only verify whether there is a partition problem.
If a node is not running, the healthcheck does not report it.
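
A rough sketch of what the missing check could look like, assuming the healthcheck can get hold of the text printed by `rabbitmqctl cluster_status` in the classic Erlang-term format, with `{nodes,...}` listed before `{running_nodes,...}`. The function name `find_stopped_nodes` and the sample output are hypothetical and only there for illustration (the node name `rabbit@ds1-stor-05` is assumed); this is not the healthcheck's actual code.

```python
import re

# Matches node names such as rabbit@ds1-stor-02, with or without Erlang quotes.
NODE_NAME = re.compile(r"[\w.\-]+@[\w.\-]+")

# Illustrative sample only, not captured from a real cluster.
SAMPLE = """Cluster status of node 'rabbit@ds1-stor-05' ...
[{nodes,[{disc,['rabbit@ds1-stor-02','rabbit@ds1-stor-05']}]},
 {running_nodes,['rabbit@ds1-stor-05']},
 {partitions,[]}]
"""


def find_stopped_nodes(cluster_status_output):
    """Return the cluster members that are not listed as running.

    Assumes the classic output format where the {nodes,...} section
    appears before the {running_nodes,...} section.
    """
    nodes_match = re.search(
        r"\{nodes,(.*?)\{running_nodes,", cluster_status_output, re.DOTALL)
    running_match = re.search(
        r"\{running_nodes,\[(.*?)\]\}", cluster_status_output, re.DOTALL)
    if nodes_match is None or running_match is None:
        raise ValueError("unrecognised cluster_status output")
    all_nodes = set(NODE_NAME.findall(nodes_match.group(1)))
    running = set(NODE_NAME.findall(running_match.group(1)))
    return sorted(all_nodes - running)
```

Calling `find_stopped_nodes(SAMPLE)` on the sample above yields the one member that is configured but not running; a healthcheck test could report a failure whenever the returned list is non-empty.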

If the queue of a particular worker is unavailable because its node is down, the worker cannot proceed and fails with the following error:

Aug 22 11:35:41 ds1-stor-05.ds1 celery[38956]:     nowait=nowait)
Aug 22 11:35:41 ds1-stor-05.ds1 celery[38956]:   File "/usr/lib/python2.7/dist-packages/amqp/channel.py", line 1256, in queue_declare
Aug 22 11:35:41 ds1-stor-05.ds1 celery[38956]:     (50, 11),  # Channel.queue_declare_ok
Aug 22 11:35:41 ds1-stor-05.ds1 celery[38956]:   File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 69, in wait
Aug 22 11:35:41 ds1-stor-05.ds1 celery[38956]:     return self.dispatch_method(method_sig, args, content)
Aug 22 11:35:41 ds1-stor-05.ds1 celery[38956]:   File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 87, in dispatch_method
Aug 22 11:35:41 ds1-stor-05.ds1 celery[38956]:     return amqp_method(self, args)
Aug 22 11:35:41 ds1-stor-05.ds1 celery[38956]:   File "/usr/lib/python2.7/dist-packages/amqp/channel.py", line 243, in _close
Aug 22 11:35:41 ds1-stor-05.ds1 celery[38956]:     reply_code, reply_text, (class_id, method_id), ChannelError,
Aug 22 11:35:41 ds1-stor-05.ds1 celery[38956]: NotFound: Queue.declare: (404) NOT_FOUND - home node 'rabbit@ds1-stor-02' of durable queue 'ovs_1cu3dqsKJReUqgJK' in vhost '/' is down or inaccessible

There is also a bug in RabbitMQ that causes the node to crash when autoheal kicks in:
rabbitmq/rabbitmq-server#928
