Configurable timeout for argocd-update step #3515

Closed
2 of 3 tasks
razvan-agape opened this issue Feb 18, 2025 · 4 comments

Comments

@razvan-agape

Checklist

  • I've searched the issue queue to verify this is not a duplicate feature request.
  • I've pasted the output of kargo version, if applicable.
  • I've pasted logs, if applicable.

Proposed Feature

The default timeout for the argocd-update step is 5 minutes.
Kargo version: 1.2.0.

Motivation

In some scenarios, an application sync may take longer than that, which marks the step as failed even though the sync might eventually succeed.

Suggested Implementation

It would be useful to be able to configure the timeout, or have a configurable number of retries.

@krancour
Member

https://docs.kargo.io/user-guide/reference-docs/promotion-templates#step-retries

@lknite
Contributor

lknite commented Mar 15, 2025

@razvan-agape is this working for you? I tried the following and it doesn't seem to have any effect:

      - uses: argocd-update
        retry:
          errorThreshold: 1
          timeout: 2m0s

kargo v1.3.1


@krancour
Member

@lknite, the timeouts are not exact because steps do not continuously retry internally. If the timeout hasn't elapsed, a step that's still running (waiting on something external) is retried on the next reconciliation attempt.

In general, those attempts are every five minutes, but the next attempt can be sooner if a related resource has a state change that forces the Promotion back onto the queue. It can also be later depending on the depth of the queue.

In your case, setting the timeout to 2m may have no practical effect in the average case, where the next reconciliation attempt is made (roughly) five minutes later. It worked well for @razvan-agape because he was raising the limit.

We're a little bit at the mercy of the controller runtime here since we don't have precise control over the interval before the next reconciliation...

That said, we can probably get closer to the specified timeout by shortening the requeue interval when the timeout would elapse sooner than the next reconciliation attempt would typically occur.

I will write up a separate issue for this as soon as I've got a chance.

@krancour
Member

@lknite, I opened #3663
