AWX EE crashes without an error message #75

Open
kazigk opened this issue Jun 16, 2021 · 1 comment

kazigk commented Jun 16, 2021

I have one specific playbook that uses strategy: free, and I run it on over 100 hosts.
It produces about 1700 lines of logs, and in most cases the job ends with an "error" status in the AWX web interface.

There's no error message in the job output in the AWX web interface, so I checked the logs of the pod itself using kubectl logs -n awx -f automation-job-81-5nmhw. This is the last line of those logs (formatted using Beautifier):

{
    "uuid": "4b8c7664-a8a6-4295-836b-9480b8766a4b",
    "counter": 2886,
    "stdout": "",
    "start_line": 1674,
    "end_line": 1674,
    "runner_ident": "81",
    "event": "runner_on_start",
    "job_id": 81,
    "pid": 20,
    "created": "2021-06-16T08:16:04.477365",
    "parent_uuid": "c676cd20-5b22-898c-8cc6-00000000007f",
    "event_data": {
        "playbook": "restart_apps.yml",
        "playbook_uuid": "19727f38-00ac-41f2-a46d-44d0ed81c721",
        "play": "apps",
        "play_uuid": "c676cd20-5b22-898c-8cc6-000000000078",
        "play_pattern": "apps",
        "task": "Start apps",
        "task_uuid": "c676cd20-5b22-898c-8cc6-00000000007f",
        "task_action": "shell",
        "task_args": "",
        "task_path": "/runner/project/restart_apps.yml:37",
        "host": "REDACTED",
        "uuid": "4b8c7664-a8a6-4295-836b-9480b8766a4b"
    }
}
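
The same formatting can be reproduced locally rather than with an online beautifier. This is only a minimal sketch, assuming the pod log was saved to text with one JSON document per line, as in the output quoted above. The function name format_last_event and the sample data are illustrative and not part of AWX or ansible-runner.

```python
import json


def format_last_event(log_text: str) -> str:
    """Pretty-print the last non-empty line of a saved job pod log.

    Assumes each log line is a single JSON document. Raises ValueError
    if the log is empty or the last line is not valid JSON.
    """
    lines = [line for line in log_text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("log is empty")
    # json.JSONDecodeError is a subclass of ValueError.
    return json.dumps(json.loads(lines[-1]), indent=4)


if __name__ == "__main__":
    # Hypothetical two-line sample; real logs contain far more fields.
    sample = '{"counter": 2885}\n{"counter": 2886, "event": "runner_on_start"}\n'
    print(format_last_event(sample))
```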

It seems like it just crashes without any error message.
The interesting part is that the playbook applies all the changes to the servers; only the output is incomplete.

When I execute the same playbook on a small part of my inventory, it finishes successfully with a play recap and these lines:

{"status": "successful", "runner_ident": "82"}
{"zipfile": 2998}
<BASE64 encoded zip file>
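
To compare a failing run with a successful one, it may help to check mechanically whether a saved log reached these closing lines. The sketch below is an assumption-laden helper, not an AWX or ansible-runner API: it assumes one JSON document per line, that the closing lines look like the two shown above ("status" and "zipfile" keys), and that events carry "event" and "counter" keys. The name summarize_stream is made up for illustration.

```python
import json


def summarize_stream(log_text: str) -> dict:
    """Summarize how far a saved job pod log got.

    Assumptions, based only on the lines quoted in this issue:
    - a finished run emits a {"status": ...} line and a {"zipfile": N} line;
    - playbook events are JSON objects with "event" and "counter" keys.
    Lines that are not JSON objects (e.g. the base64 payload) are skipped.
    """
    status = None
    has_zipfile = False
    last_event = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        try:
            doc = json.loads(line)
        except ValueError:
            continue  # not JSON, e.g. the base64-encoded zip file
        if not isinstance(doc, dict):
            continue
        if "status" in doc:
            status = doc["status"]  # keep the most recent status seen
        elif "zipfile" in doc:
            has_zipfile = True
        elif "event" in doc:
            last_event = doc
    return {
        "status": status,
        "has_zipfile": has_zipfile,
        "last_event": last_event.get("event") if last_event else None,
        "last_counter": last_event.get("counter") if last_event else None,
    }


if __name__ == "__main__":
    # Hypothetical file name; save the output of kubectl logs to it first.
    with open("job-81.log", encoding="utf-8") as fh:
        print(summarize_stream(fh.read()))
```

For the failing job described here, such a summary would report no zipfile line and runner_on_start as the last event, which narrows the question to why the stream stops rather than why the playbook fails.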

I tested it on the following versions of AWX EE: 0.2.0, 0.3.0 and 0.4.0 with the same result.
I also checked logs of awx-web, awx-task and awx-ee, but I didn't find anything useful.

AWX Version: 19.2.0
AWX Operator version: 0.10.0

Is there anything else I can check?

shanemcd (Member) commented

This smells a lot like ansible/awx#9961
