Getting fatal error on "Get list of labels." task #78

Comments
I have the same issue, except the error says 'ambiguous' instead of 'not found'. Edit: the workaround for me was simply making the node leave: 'docker swarm leave --force'.
Thanks @cpxPratik for reporting this issue. Can you please confirm the Docker version you are using?
@atosatto The Docker version is
Hello, I'm having the same issue on a cluster. It seems the node object is using the hostname instead of the full FQDN. This appears to be the root cause: 3bb8a49. Though I don't see any reference in the playbook to joining by FQDN, is this a new change in upstream Docker?
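A quick way to see the mismatch described above (a sketch, not part of the role): compare the name Swarm has registered for the node with the FQDN that ansible_fqdn resolves to.

```sh
# On a manager: the names Swarm has registered (typically the short hostnames)
docker node ls --format '{{ .Hostname }}'

# On the affected node: the FQDN that ansible_fqdn would pick up
hostname -f
```

If the two don't match, any task that looks the node up by FQDN will fail with 'not found' or 'ambiguous'.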
Btw, I'm currently running version 19.03.8.
Same issue here, using 19.03.6 (the latest docker.io package provided by Ubuntu 18.04).
I have the same issue too. Ubuntu 18.04.
@atosatto We fixed this a while back, but it was reverted or we mixed it up. It's inventory_hostname vs. fqdn.
Workaround for me was to replace ansible_fqdn with inventory_hostname.
Hello. I'm also having this issue. Any plans to reapply the fix? Thanks!
Seeing the same behaviour on
Another case where this happens is the following: I had botched my swarm setup, so it was not about node names (e.g. the inventory name or the fully qualified domain name, FQDN); rather, the nodes were no longer seen by the manager. The role doesn't currently handle this (no judgement meant). I think it's a split-brain/no-brain kind of thing, because I had restarted my manager (and I run only one) and then this happened. The fix was the following:
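A minimal sketch of that kind of cleanup, assuming a single manager and a placeholder worker name; the exact commands depend on your setup:

```sh
# On each worker that still believes it is part of the old swarm:
docker swarm leave --force

# On the manager, remove the node entries that now show up as Down
# (worker-1 is a placeholder name; repeat per stale node):
docker node rm --force worker-1

# Re-running the role afterwards lets the workers join the swarm again.
```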
And then the role completes. The other fix is to run two managers. ;-) I am not entirely sure how this could be added to the role, since the manager doesn't see the workers anymore but the workers think they are still connected. If you can afford it, trash the nodes and set them up again. Maybe it's a documentation thing after all?
Same issue on CentOS 7. For now I am using v2.2.0, which works like a charm!
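For anyone staying on the older release, pinning the role version is straightforward (assuming the Galaxy name is atosatto.docker-swarm; adjust if yours differs):

```sh
ansible-galaxy install atosatto.docker-swarm,v2.2.0
```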
Now I'm not sure if this has to do with this at all, as I have been getting this error several times even with that commit reverted. It always happens when I add a new instance to the cluster. The first time I run this role it is OK; then I create a new AWS instance, run the role again to add it to the cluster, and the role fails with this error. This is the error message Ansible throws on nodes that are already part of the cluster:
That is the error for the manager, but the workers throw it too.
For me it also happens with v2.2.0, as described in my previous comment.
I had to use this role again and got an error when running it for the second time, and this time I noticed that the error was different from the one of this issue (and probably the error reported in my previous comment was about this new issue and not related to this one). This time the error is on the "Remove labels from swarm node" task, and it occurs when labels are configured outside this role (i.e. manually adding a label to a node). I will create a separate issue for that with an accompanying PR fixing it.
Added issue #96 for this and a fix in PR #97; I hope it gets merged (although I do not have my hopes up that it will happen here).
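For context, "labels configured outside this role" means something like the following manual labelling done on a manager (placeholder label and node name), which the "Remove labels from swarm node" task then stumbles over:

```sh
docker node update --label-add env=staging staging-worker-01
```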
The task "Get list of labels." is failing after it was updated to use ansible_fqdn in 3bb8a49. The node hostname (staging-manager-03) shown by docker node ls is different from the FQDN string given in the following error:

For now I am using v2.2.0, which gives no error.
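To illustrate the mismatch from the report above (a sketch, not the role's exact command): inspecting the node by its FQDN fails when Swarm registered it under the short hostname, while inspecting it by the short name works.

```sh
# Run on a manager. Fails when Swarm only knows the short hostname:
docker node inspect "$(hostname -f)" --format '{{ json .Spec.Labels }}'

# Works, because docker node ls lists the short hostname:
docker node inspect "$(hostname -s)" --format '{{ json .Spec.Labels }}'
```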