Retry Build and Functional Test Steps when timeout occurs #14821

Open
wants to merge 8 commits into base: 7.0.x

Conversation

jamesfredley
Contributor

@jamesfredley jamesfredley commented Jun 18, 2025

Attempts to address: Connect to repository.apache.org:443 [repository.apache.org/65.109.119.155] failed: Connect timed out

max_attempts: 3
retry_wait_seconds: 600

nick-fields/retry is already used in publish
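
For readers without the diff in front of them, a minimal sketch of what such a step could look like follows. It is not the actual change in this PR: the step name and the Gradle command are assumptions for illustration, while `max_attempts` and `retry_wait_seconds` are the values listed above and `timeout_seconds: 3600` is the value quoted in the review comment on `.github/workflows/gradle.yml`.

```yaml
# Sketch only; step name and command are hypothetical, not taken from the PR diff.
- name: "Build (with retry)"            # hypothetical step name
  uses: nick-fields/retry@v3
  with:
    timeout_seconds: 3600               # per-attempt limit; value quoted in the review comment
    max_attempts: 3                     # from the PR description
    retry_wait_seconds: 600             # from the PR description: wait 10 minutes between attempts
    command: ./gradlew build            # hypothetical command
```

With these values, a step that fails on every attempt would wait 600 seconds twice (between attempts 1 and 2, and between 2 and 3), adding 20 minutes of waiting on top of the attempts themselves.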

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This pull request updates the Gradle workflow to use a retry action when the build step times out.

  • Replaces the simple run command with a retry action from nick-fields/retry
  • Sets additional parameters (timeout, max attempts, and retry wait) for improved build reliability

Comments suppressed due to low confidence (1)

.github/workflows/gradle.yml:92

  • The comment indicates a normal range of 20-40 minutes, but 3600 seconds represents 60 minutes. Please update the comment or the timeout value to reflect the intended behavior.
          timeout_seconds: 3600 # normal range 20-40m

@jamesfredley jamesfredley moved this to In Progress in Apache Grails 7.0.x Jun 18, 2025
@jamesfredley jamesfredley changed the title from "Retry Build Step when timeout occurs" to "Retry Build and Functional Test Steps when timeout occurs" Jun 18, 2025
@matrei
Contributor

matrei commented Jun 18, 2025

So, if I understand this correctly, this will retry the step for any failure, not just timeouts?
Like, if there is a genuine test failure, it will still wait 10 minutes and then retry?

@jdaugherty
Contributor

jdaugherty commented Jun 18, 2025

My understanding is that RAO (repository.apache.org) will be back to normal within the week (https://the-asf.slack.com/archives/CBX4TSBQ8/p1749838588937849). These issues are temporary but painful (https://the-asf.slack.com/archives/CBX4TSBQ8/p1750077904096649). I think we should hold off on this and continue to retry manually. Auto-retrying will likely lead to more issues.

@jamesfredley
Contributor Author

jamesfredley commented Jun 18, 2025

nick-fields/retry does support retry_on, although I'm not confident these failures are reported to it in a way that will match.

retry_on
Optional Event to retry on. Currently supports [any (default), timeout, error].

Every CI run failing is very painful at present.
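
For reference, a sketch of how the `retry_on` option quoted above could be set. The step name and command are hypothetical. As far as I understand the action, `timeout` means the command ran longer than `timeout_seconds` (or `timeout_minutes`), whereas a Gradle "Connect timed out" makes the command exit with a non-zero code and so would probably count as `error`. If that reading is right, `retry_on: timeout` would not catch the repository.apache.org failures, and `retry_on: error` would also retry genuine test failures.

```yaml
# Sketch only; step name and command are hypothetical.
- name: "Build (retry only when the step itself times out)"
  uses: nick-fields/retry@v3
  with:
    timeout_seconds: 3600
    max_attempts: 3
    retry_wait_seconds: 600
    retry_on: timeout                   # documented values: any (default), timeout, error
    command: ./gradlew build            # hypothetical command
```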

@jamesfredley
Contributor Author

Interestingly, on https://github.com/apache/grails-core/actions/runs/15731765021/job/44334333078?pr=14821 there were 0 timeouts on the first attempt, which is the first time I have seen that in a while.

3 participants