Powershell execution fails sometimes with "WSMAN ERROR CODE: 1018" because of switched off retries #3385

Open
@wahlm

Describe the Bug

Our Bolt plans and tasks that run PowerShell over the WinRM transport sometimes fail with "WSMAN ERROR CODE: 1018". This seems to be caused by a temporary problem on the Windows system. Unfortunately, we were able to find neither the cause nor a workaround.
The Ruby gem WinRb/WinRM that Bolt uses provides an automatic retry for this case: the shell call is wrapped in "retryable". The default is 3 retries with a waiting time of 10 seconds (exception class: WinRM::WinRMWSManFault).
But in the Bolt code at
https://github.com/puppetlabs/bolt/blob/main/lib/bolt/transport/winrm/connection.rb#L45
the retries are explicitly switched off by setting retry_limit to 1. The option value given there applies to the whole connection.
After increasing retry_limit to a higher value, we no longer see any WSMAN errors.
Why are the retries explicitly switched off?
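To illustrate the behavior described above, here is a minimal, self-contained sketch of a retry wrapper. It is not the WinRM gem's actual code: `with_retries`, `TransientFault`, and `flaky_call` are hypothetical names, and the sketch assumes that a limit of 1 means a single attempt with no retry, which matches what we observe.

```ruby
# Hypothetical stand-in for a transient WSMan fault.
# This is NOT the real WinRM::WinRMWSManFault class.
class TransientFault < StandardError; end

# Run the block up to `limit` times in total, sleeping `delay` seconds
# between attempts. Assumption: limit == 1 means "one attempt, no retry".
def with_retries(limit, delay)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue TransientFault
    raise if attempts >= limit # attempts exhausted: re-raise the fault
    sleep(delay)
    retry
  end
end

# Build an operation that fails `fail_times` times before succeeding,
# simulating a temporary problem on the remote Windows system.
def flaky_call(fail_times)
  calls = 0
  lambda do
    calls += 1
    raise TransientFault, 'WSMAN ERROR CODE: 1018' if calls <= fail_times
    "ok after #{calls} call(s)"
  end
end
```

With a limit of 3, a single transient fault is absorbed: `with_retries(3, 0) { op.call }` succeeds on the second attempt when `op = flaky_call(1)`. With a limit of 1, the same fault propagates to the caller, which is the failure mode we see in Bolt.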

Expected Behavior

Bolt plans and tasks should not fail because of temporary WSMAN problems on the Windows system.

Steps to Reproduce

Because we have not found the cause of the WSMAN problems in the Windows OS, we cannot give any advice on how to trigger them deliberately.

Environment

  • bolt-4.0.0
  • Windows Server 2022 in AWS

Labels

  • Bug (bug reports and fixes)