Wait for timeout for ExtPolicy disallowed delete test case #3320

Merged
merged 8 commits into from
Feb 14, 2025

@mgunnala mgunnala commented Feb 11, 2025

Description

Issue #

We are seeing intermittent failures in the ExtPolicy (ext_policy.py) test scenario for the following sequence:

  • block custom script with policy, then try to delete -> fails as expected
  • allow custom script with policy and retry the delete -> times out when it should succeed

2025-02-03T13:49:24Z.243 [ERROR] ******** [Failed] ExtPolicy: Fail: Unexpected error while trying to delete Microsoft.Azure.Extensions.CustomScript. Extension is allowed by policy so this operation should have completed successfully.
Error: [Delete Microsoft.Azure.Extensions.CustomScript] did not complete within 600 seconds!

This PR makes the following changes to address this issue:

  • for the delete of a disallowed extension, wait for the full CRP timeout period, remove the current workarounds, and move this case to the end of the test
  • install guest config during test setup, so it is not automatically installed during the CRP timeout wait period
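
A minimal sketch of what waiting for the full timeout could look like, assuming a hypothetical polling helper. The name `wait_for_operation`, its parameters, and the injected `clock`/`sleep` callables are illustrative and are not taken from ext_policy.py or the test framework:

```python
import time
from typing import Callable


def wait_for_operation(
    is_done: Callable[[], bool],
    timeout_seconds: float,
    poll_interval_seconds: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_done() until it returns True or timeout_seconds have elapsed.

    Returns True if the operation completed, False if the deadline passed.
    (Hypothetical helper, for illustration only.)
    """
    deadline = clock() + timeout_seconds
    while True:
        if is_done():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval_seconds, remaining))
```

The point of the design is that the caller's `timeout_seconds` should be at least as long as the service-side timeout, so the test observes the service's own result instead of giving up first.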

This PR also makes the following improvements to the test:

  • during test setup, only delete extensions being tested, instead of all extensions on the VM
  • add try/finally to clean up the policy if the test fails
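
The two improvements above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: `extensions_to_delete`, `run_with_policy_cleanup`, and the callables passed to them are hypothetical names.

```python
from typing import Callable, Iterable, List


def extensions_to_delete(installed: Iterable[str], under_test: Iterable[str]) -> List[str]:
    """Return only the installed extensions that the test itself exercises."""
    tested = set(under_test)
    return [name for name in installed if name in tested]


def run_with_policy_cleanup(
    test_cases: Iterable[Callable[[], None]],
    cleanup_policy: Callable[[], None],
) -> None:
    """Run the test cases in order and always clean up the policy afterwards."""
    try:
        for test_case in test_cases:
            test_case()
    finally:
        # Runs on success and on failure, so a failed test case does not
        # leave a restrictive policy behind on the VM.
        cleanup_policy()
```

Any exception raised by a test case still propagates after the cleanup runs, so the failure is reported as usual.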

PR information

  • Ensure development PR is based on the develop branch.
  • The title of the PR is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For information on cleaning up the commits in your pull request, see this page.
  • If applicable, the PR references the bug/issue that it fixes in the description.
  • New Unit tests were added for the changes made

Quality of Code and Contribution Guidelines

self._ssh_client.run_command("update-waagent-conf Debug.EnableExtensionPolicy=y", use_sudo=True)

# Azure Policy automatically installs the GuestConfig extension on test machines, which may occur
# during the CRP timeout wait (test case 5), inadvertently resetting the timeout period.
Member

"inadvertently resetting the timeout period" - what are the consequences of that?

btw, GuestConfig is only one of the extensions installed by policy

Author

It just makes the test longer: the 15-minute timeout period restarts when CRP gets the GuestConfig enable request. In the worst case, GuestConfig is enabled ~14 minutes after the delete request, for a total waiting period of 29 minutes.

What other extensions are installed by policy?
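
The worst-case figure in this comment can be checked with a small calculation. The function below is a hypothetical illustration that assumes the timeout restarts exactly once and in full:

```python
def worst_case_wait_minutes(timeout_minutes: int, reset_after_minutes: int) -> int:
    """Total wait when the timeout restarts once, reset_after_minutes into the wait."""
    if not 0 <= reset_after_minutes < timeout_minutes:
        raise ValueError("the reset must happen before the original timeout expires")
    # Time already spent waiting, plus one full new timeout period.
    return reset_after_minutes + timeout_minutes


print(worst_case_wait_minutes(15, 14))  # 29
```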

Contributor

@maddieford maddieford left a comment

lgtm

Member

@narrieta narrieta left a comment

approved by mistake

@maddieford maddieford merged commit 6444364 into Azure:develop Feb 14, 2025
11 checks passed
@mgunnala mgunnala deleted the delete_timeout branch February 14, 2025 18:30
3 participants