sap_install_media_detect: move_files_to_main_directory worker dead state #1052

Open
@remrozk

Ansible Role

sap_install_media_detect

OS Family

N/A

Ansible Controller - Python version

Python 3.11.13

Ansible-core version

ansible [core 2.16.14]

Bug Description

Issue

I am using the following SAP architecture:

  • 1 SAP HANA DB
  • 1 ABAP Central Services (ASCS)
  • 1 Primary Application Server (installing both DB instance and CI)
  • 1 Additional Application Server (DI)

The Ansible worker is a container running on a host with 12 vCPUs and 24 GB of RAM, with no resource limits set on the worker itself.

During the installation of SAP S/4HANA 2021 ABAP (note: this issue has only been seen with this version, not with S/4HANA 2023 ABAP), I executed the following roles consecutively to install the database instance and central instance on the same host:

  1. community.sap_install.sap_install_media_detect
  2. community.sap_install.sap_swpm

However, the Ansible worker crashed during execution of the following task in the sap_install_media_detect role for the CI installation:

SAP Install Media Detect - Prepare - Move files to parent for known subdirs - Set fact from find result, phase 1b

The logs showed this:

2025-06-17 23:01:09,899 p=3568 u=root n=ansible | TASK [community.sap_install.sap_install_media_detect : SAP Install Media Detect - Prepare - Move files to parent for known subdirs - Set fact from find result, phase 1b] ***
...
2025-06-17 23:01:33,967 p=3568 u=root n=ansible | ERROR! A worker was found in a dead state

Root Cause

The task uses the following Ansible logic:

- name: SAP Install Media Detect - Prepare - Move files to parent for known subdirs - Set fact from find result, phase 1b
  ansible.builtin.set_fact:
    __sap_install_media_detect_fact_find_result_phase_1b: "{{ __sap_install_media_detect_fact_find_result_phase_1b + [item.1.path] }}"
  loop: "{{ __sap_install_media_detect_register_find_result_phase_1b.results | subelements('files') }}"
  loop_control:
    label: "{{ item.1.path }}"

This approach creates a new list with every loop iteration using:

__sap_install_media_detect_fact_find_result_phase_1b + [item.1.path]

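Each iteration re-evaluates the template and copies the entire accumulated list, so the number of copied elements grows quadratically with the number of files, on top of the per-iteration overhead of running set_fact. With a large number of iterations, this results in excessive CPU consumption and memory pressure, which can ultimately crash the worker.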
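The growth pattern can be illustrated outside Ansible with a rough Python sketch. It only counts how many list elements get copied when a list is rebuilt on every iteration versus built once. It does not model Ansible's templating or per-task overhead, and the function names are hypothetical.

```python
def copies_when_rebuilding(n):
    """Count element copies when `acc = acc + [item]` runs n times."""
    acc = []
    copied = 0
    for i in range(n):
        # `acc + [i]` allocates a brand-new list holding len(acc) + 1 elements
        copied += len(acc) + 1
        acc = acc + [i]
    return copied


def copies_when_single_pass(n):
    """Count element copies when the list is built once from the source."""
    acc = list(range(n))
    return len(acc)


# Approximate basket sizes mentioned in this issue
for files in (300, 400, 600):
    print(files, copies_when_rebuilding(files), copies_when_single_pass(files))
```

For 600 files the rebuilding variant copies 180,300 elements in total, compared with 600 for the single pass. In Ansible, the cost of each iteration is likely much higher than a plain list copy, so this is a lower bound on the difference rather than a measurement.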

For comparison, the download basket generated by the Maintenance Planner (with the latest Support Packages) contains approximately the following number of files per S/4HANA version:

  • ~600 files for S/4HANA 2021
  • ~400 files for S/4HANA 2022
  • ~300 files for S/4HANA 2023

Solution

The issue can be resolved by avoiding per-iteration list rebuilding. Instead, collect all paths at once in a single set_fact assignment:

- name: SAP Install Media Detect - Prepare - Move files to parent for known subdirs - Set fact from find result, phase 1b
  ansible.builtin.set_fact:
    __sap_install_media_detect_fact_find_result_phase_1b: >-
      {{
        __sap_install_media_detect_register_find_result_phase_1b.results
        | map(attribute='files')
        | flatten
        | map(attribute='path')
        | list
      }}
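To make the data flow of this filter chain concrete, here is a rough plain-Python equivalent. It is applied to a made-up structure shaped like the registered result of a looped find task: a results list whose entries each carry a files list of dicts with a path key. The sample paths and the helper name are hypothetical.

```python
from itertools import chain


def collect_paths(results):
    """Mimic: results | map(attribute='files') | flatten | map(attribute='path') | list"""
    # map(attribute='files'): one list of file dicts per loop result
    files_per_result = (r["files"] for r in results)
    # flatten: merge the per-result lists into one sequence
    # (one level is enough here, since each entry is a flat list of dicts)
    all_files = chain.from_iterable(files_per_result)
    # map(attribute='path') | list: keep only the path of each file
    return [f["path"] for f in all_files]


# Hypothetical sample data, not taken from a real run
sample_results = [
    {"files": [{"path": "/media/sub_a/file1.SAR"}, {"path": "/media/sub_a/file2.SAR"}]},
    {"files": []},
    {"files": [{"path": "/media/sub_b/file3.ZIP"}]},
]

paths = collect_paths(sample_results)
print(paths)
```

The whole list is produced in one pass, so the work is proportional to the number of files rather than to its square.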

A second, shorter solution uses the json_query filter from the community.general collection, which requires the jmespath Python package on the Ansible controller:

- name: SAP Install Media Detect - Prepare - Move files to parent for known subdirs - Set fact from find result, phase 1b
  ansible.builtin.set_fact:
    __sap_install_media_detect_fact_find_result_phase_1b: >-
      {{
        __sap_install_media_detect_register_find_result_phase_1b.results
        | json_query("[].files[].path")
      }}

Either variant replaces the loop with a single set_fact evaluation, which significantly improves performance, reducing task execution time and preventing the Ansible worker from crashing.

How do you feel about this solution?

Bug reproduction

Install the database instance and the Central Instance (CI) with all Support Packages for S/4HANA 2021 ABAP on the same host.
Run the roles community.sap_install.sap_install_media_detect followed by community.sap_install.sap_swpm twice: first for the database instance, and then for the Central Instance.

Community participation

Happy to implement this bug fix

Metadata

Assignees: No one assigned

Labels: bug (Something isn't working)
