[serve][bugfix] Clip number of retries shown in error message (#49318)

## Why are these changes needed?

The status message should never say that Serve will retry a negative
number of times. This can happen when the replica constructor retry
counter exceeds the failed-to-start threshold. In that case, clip the
remaining retry count to a lower bound of 0 to avoid the confusing
status message.
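
The clipping described above can be sketched in isolation as follows. This is a minimal illustration, not the code in `deployment_state.py`: the function names `remaining_retries` and `retrying_message` are hypothetical, and the threshold and counter are passed as plain integers instead of being read from instance attributes.

```python
def remaining_retries(failed_to_start_threshold: int, retry_counter: int) -> int:
    """Hypothetical helper: number of retries left, never below zero."""
    # max(..., 0) clips the difference when the counter has passed the threshold.
    return max(failed_to_start_threshold - retry_counter, 0)


def retrying_message(failed_to_start_threshold: int, retry_counter: int) -> str:
    """Hypothetical helper: build the retry part of the status message."""
    msg = "Retrying"
    # A threshold of 0 adds no count to the message, mirroring the
    # `!= 0` check in the diff.
    if failed_to_start_threshold != 0:
        left = remaining_retries(failed_to_start_threshold, retry_counter)
        msg += f" {left} more time(s)"
    return msg


if __name__ == "__main__":
    print(retrying_message(3, 1))
    print(retrying_message(3, 5))
```

Without the `max(..., 0)` clip, the second call would report `-2 more time(s)`.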

## Related issue number
N/A

## Checks

- [ ] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: akyang-anyscale <[email protected]>
akyang-anyscale authored Dec 19, 2024
1 parent cff17ee commit 5efdc5a
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions python/ray/serve/_private/deployment_state.py
@@ -2052,9 +2052,10 @@ def record_replica_startup_failure(self, error_msg: str):

         retrying_msg = "Retrying"
         if self._failed_to_start_threshold != 0:
-            remaining_retries = (
+            remaining_retries = max(
                 self._failed_to_start_threshold
-                - self._replica_constructor_retry_counter
+                - self._replica_constructor_retry_counter,
+                0,
             )
             retrying_msg += f" {remaining_retries} more time(s)"
