Update ray.go not to fail when going suspend state. #5816

Merged 4 commits on Oct 23, 2024

aminmaghsodi
Contributor

We use a queue to schedule our Ray jobs, so a job needs to be able to wait in the Kubernetes Suspended state.
Currently the workflow fails as soon as the job state becomes 'Suspended'.
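
A minimal sketch of the behavior described above, not the actual ray.go code. The types `JobDeploymentStatus` and `Phase`, the constants, and `phaseFor` are hypothetical stand-ins for the `rayv1` and `pluginsCore` identifiers; the point is only that a Suspended job is reported as queued rather than falling through to the error path. Whether an error should accompany the queued phase is an assumption here; this sketch returns none.

```go
package main

import "fmt"

// JobDeploymentStatus and Phase are hypothetical stand-ins for the
// rayv1 job deployment status and the pluginsCore phase types.
type JobDeploymentStatus string
type Phase string

const (
	StatusRunning   JobDeploymentStatus = "Running"
	StatusComplete  JobDeploymentStatus = "Complete"
	StatusFailed    JobDeploymentStatus = "Failed"
	StatusSuspended JobDeploymentStatus = "Suspended"
)

const (
	PhaseUndefined        Phase = "Undefined"
	PhaseQueued           Phase = "Queued"
	PhaseRunning          Phase = "Running"
	PhaseSuccess          Phase = "Success"
	PhasePermanentFailure Phase = "PermanentFailure"
)

// phaseFor maps a job deployment status to a task phase.
// Unknown statuses yield PhaseUndefined together with an error.
func phaseFor(s JobDeploymentStatus) (Phase, error) {
	switch s {
	case StatusRunning:
		return PhaseRunning, nil
	case StatusComplete:
		return PhaseSuccess, nil
	case StatusFailed:
		return PhasePermanentFailure, nil
	case StatusSuspended:
		// The job is waiting in an external queue: report it as queued
		// instead of treating the status as a failure.
		return PhaseQueued, nil
	default:
		return PhaseUndefined, fmt.Errorf("unknown job deployment status: %s", s)
	}
}

func main() {
	phase, err := phaseFor(StatusSuspended)
	fmt.Println(phase, err) // prints "Queued <nil>"
}
```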

Signed-off-by: Amin Maghsodi <[email protected]>

welcome bot commented Oct 6, 2024

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).


codecov bot commented Oct 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 36.81%. Comparing base (9abfbda) to head (c4fef38).
Report is 173 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5816      +/-   ##
==========================================
+ Coverage   36.35%   36.81%   +0.45%     
==========================================
  Files        1304     1309       +5     
  Lines      110147   130899   +20752     
==========================================
+ Hits        40042    48184    +8142     
- Misses      65938    78533   +12595     
- Partials     4167     4182      +15     
Flag Coverage Δ
unittests-datacatalog 51.58% <ø> (+0.21%) ⬆️
unittests-flyteadmin 54.03% <ø> (-1.57%) ⬇️
unittests-flytecopilot 11.73% <ø> (-0.45%) ⬇️
unittests-flytectl 62.40% <ø> (+0.18%) ⬆️
unittests-flyteidl 6.92% <ø> (-0.25%) ⬇️
unittests-flyteplugins 53.57% <100.00%> (+0.22%) ⬆️
unittests-flytepropeller 43.00% <ø> (+0.98%) ⬆️
unittests-flytestdlib 55.39% <ø> (+0.02%) ⬆️


Signed-off-by: Amin Maghsodi <[email protected]>
Improve test cov

Signed-off-by: Amin Maghsodi <[email protected]>
aminmaghsodi changed the title from "Update ray.go" to "Update ray.go not to fail when going suspend state." on Oct 16, 2024
pingsutw (Member) previously approved these changes on Oct 16, 2024

LGTM, thanks. Just curious: which gang scheduler are you using? Did you need to change anything in Flyte to make your gang scheduler work with Ray tasks?

@@ -755,7 +755,8 @@ func TestGetTaskPhase(t *testing.T) {
 {rayv1.JobDeploymentStatusRunning, pluginsCore.PhaseRunning, false},
 {rayv1.JobDeploymentStatusComplete, pluginsCore.PhaseSuccess, false},
 {rayv1.JobDeploymentStatusFailed, pluginsCore.PhasePermanentFailure, false},
-{rayv1.JobDeploymentStatusSuspended, pluginsCore.PhaseUndefined, true},
+{rayv1.JobDeploymentStatusSuspended, pluginsCore.PhaseQueued, true},
+{rayv1.JobDeploymentStatusSuspending, pluginsCore.PhaseUndefined, true},
Member


@aminmaghsodi could we map it to pluginsCore.PhaseQueued as well?
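
One way to read this suggestion, sketched as a small table-driven check in the style of the test under review. All names here (`Status`, `Phase`, the constants, `phaseFor`) are hypothetical stand-ins rather than the real `rayv1` or `pluginsCore` identifiers, and treating Suspending the same as Suspended is the assumption being illustrated, not a statement of what was merged.

```go
package main

import "fmt"

// Status and Phase are hypothetical stand-ins for the rayv1 and
// pluginsCore types referenced in the review.
type Status string
type Phase string

const (
	StatusRunning    Status = "Running"
	StatusSuspended  Status = "Suspended"
	StatusSuspending Status = "Suspending"

	PhaseUndefined Phase = "Undefined"
	PhaseQueued    Phase = "Queued"
	PhaseRunning   Phase = "Running"
)

// phaseFor treats both Suspended and Suspending as queued (assumption).
func phaseFor(s Status) Phase {
	switch s {
	case StatusRunning:
		return PhaseRunning
	case StatusSuspended, StatusSuspending:
		return PhaseQueued
	default:
		return PhaseUndefined
	}
}

func main() {
	// Table of inputs and expected phases, mirroring the test-table layout.
	cases := []struct {
		status Status
		want   Phase
	}{
		{StatusRunning, PhaseRunning},
		{StatusSuspended, PhaseQueued},
		{StatusSuspending, PhaseQueued},
	}
	for _, c := range cases {
		got := phaseFor(c.status)
		fmt.Printf("%-10s -> %-9s ok=%t\n", c.status, got, got == c.want)
	}
}
```

Grouping the two statuses in one `case` keeps the transitional state from falling into the default branch while the job is on its way to being suspended.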

pingsutw merged commit 8890b38 into flyteorg:master on Oct 23, 2024
50 checks passed

welcome bot commented Oct 23, 2024

Congrats on merging your first pull request! 🎉

2 participants