Write abort event for aborted k8s workflows #5785

Closed

andresgomezfrr (Contributor) commented Sep 30, 2024

Tracking issue

Potentially closes #5547 and #5786

Why are the changes needed?

To ensure workflows can transition from Aborting to Aborted even when flytepropeller is unable to perform the transition.

What changes were proposed in this pull request?

As proposed in #5547, we now emit an abort event after sending the Delete call to the Flyte k8s cluster.
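
The delete-then-emit flow described above could be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `WorkflowDeleter`, `EventSink`, `AbortEvent`, `AbortAndEmit`, `ErrNotFound`, `errTransient`, and the in-memory fakes are all hypothetical names, and treating a "not found" delete as success is an assumption made for the sketch.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Hypothetical sentinel errors used only in this sketch.
var (
	ErrNotFound  = errors.New("workflow resource not found")
	errTransient = errors.New("transient API failure")
)

// WorkflowDeleter is a hypothetical stand-in for whatever client deletes the
// FlyteWorkflow resource from the cluster.
type WorkflowDeleter interface {
	Delete(ctx context.Context, executionID string) error
}

// AbortEvent is a hypothetical, simplified event payload.
type AbortEvent struct {
	ExecutionID string
	Phase       string
	Cause       string
}

// EventSink is a hypothetical stand-in for the component that records events.
type EventSink interface {
	Emit(ctx context.Context, ev AbortEvent) error
}

// AbortAndEmit deletes the resource and then emits an ABORTED event, so the
// execution does not rely on another component to leave the Aborting state.
func AbortAndEmit(ctx context.Context, d WorkflowDeleter, s EventSink, executionID, cause string) error {
	err := d.Delete(ctx, executionID)
	// Assumption: a resource that is already gone counts as a successful delete.
	if err != nil && !errors.Is(err, ErrNotFound) {
		// The delete genuinely failed, so do not claim the workflow is aborted.
		return fmt.Errorf("deleting workflow %q: %w", executionID, err)
	}
	return s.Emit(ctx, AbortEvent{ExecutionID: executionID, Phase: "ABORTED", Cause: cause})
}

// fakeDeleter and recordingSink are in-memory fakes used only by the demo.
type fakeDeleter struct{ err error }

func (f fakeDeleter) Delete(context.Context, string) error { return f.err }

type recordingSink struct{ events []AbortEvent }

func (r *recordingSink) Emit(_ context.Context, ev AbortEvent) error {
	r.events = append(r.events, ev)
	return nil
}

// runAbort wires the fakes together and returns the events that were recorded.
func runAbort(deleteErr error) ([]AbortEvent, error) {
	sink := &recordingSink{}
	err := AbortAndEmit(context.Background(), fakeDeleter{err: deleteErr}, sink, "exec-1", "user requested abort")
	return sink.events, err
}

func main() {
	for _, deleteErr := range []error{nil, ErrNotFound, errTransient} {
		events, err := runAbort(deleteErr)
		fmt.Printf("deleteErr=%v events=%+v err=%v\n", deleteErr, events, err)
	}
}
```

In this sketch the event is emitted only when the delete succeeded or the resource was already absent; a failed delete returns an error and records nothing.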

How was this patch tested?

With unit tests and in a staging environment.

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

codecov bot commented Sep 30, 2024

Codecov Report

Attention: Patch coverage is 64.51613% with 11 lines in your changes missing coverage. Please review.

Project coverage is 36.32%. Comparing base (66ff152) to head (358151e).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| flyteadmin/pkg/workflowengine/impl/k8s_executor.go | 76.92% | 6 Missing ⚠️ |
| flyteadmin/pkg/rpc/adminservice/base.go | 0.00% | 5 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5785   +/-   ##
=======================================
  Coverage   36.31%   36.32%           
=======================================
  Files        1304     1304           
  Lines      110048   110067   +19     
=======================================
+ Hits        39964    39981   +17     
- Misses      65928    65930    +2     
  Partials     4156     4156           

| Flag | Coverage Δ |
|---|---|
| unittests-datacatalog | 51.37% <ø> (ø) |
| unittests-flyteadmin | 55.62% <64.51%> (+0.03%) ⬆️ |
| unittests-flytecopilot | 12.17% <ø> (ø) |
| unittests-flytectl | 62.21% <ø> (ø) |
| unittests-flyteidl | 7.12% <ø> (ø) |
| unittests-flyteplugins | 53.35% <ø> (ø) |
| unittests-flytepropeller | 41.93% <ø> (ø) |
| unittests-flytestdlib | 55.35% <ø> (-0.03%) ⬇️ |


andresgomezfrr force-pushed the fix-aborting-subworkflows branch from 6f4c335 to bcba038 on September 30, 2024 15:16
RRap0so marked this pull request as ready for review on September 30, 2024 16:30
andresgomezfrr and others added 5 commits September 30, 2024 20:53
Signed-off-by: Andres Gomez Ferrer <[email protected]>
Signed-off-by: Rafael Raposo <[email protected]>
Signed-off-by: Rafael Raposo <[email protected]>
Signed-off-by: Rafael Raposo <[email protected]>
Signed-off-by: Rafael Raposo <[email protected]>
RRap0so force-pushed the fix-aborting-subworkflows branch from 142ed5a to 358151e on September 30, 2024 18:54

pvditt (Contributor) left a comment

I lean against adding this eventing in admin. I think this status update should be handled in propeller.

FlyteWorkflow CRDs get finalizers set by default. In your scenario, it seems the delete occurs before propeller is able to set finalizers on the FlyteWorkflow. Let me check whether there would be any issues with setting the finalizer in the same loop that creates the resource, either in the propeller-side admin launcher or by default in admin. I'm unsure why it isn't simply set when the FlyteWorkflow is created.

I think getting the finalizer set would be a better path forward.
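
To illustrate why setting the finalizer at creation time would matter, here is a toy in-memory model of the general Kubernetes behaviour: deleting an object that still has finalizers only marks it for deletion, and it is removed once the finalizers are cleared. This is a sketch, not Flyte or client-go code: `Store`, `Object`, `Finalize`, and the finalizer name `example.com/workflow-cleanup` are hypothetical.

```go
package main

import "fmt"

// exampleFinalizer is a hypothetical finalizer name, not the one Flyte uses.
const exampleFinalizer = "example.com/workflow-cleanup"

// Object models only the metadata fields relevant to finalizers.
type Object struct {
	Name       string
	Finalizers []string
	Deleting   bool // stands in for a non-nil metadata.deletionTimestamp
}

// Store is a toy in-memory stand-in for the API server.
type Store struct {
	objects map[string]*Object
}

func NewStore() *Store {
	return &Store{objects: map[string]*Object{}}
}

// Create stores a new object. With withFinalizer set, the finalizer is present
// from the moment the object exists, so no delete can race ahead of it.
func (s *Store) Create(name string, withFinalizer bool) *Object {
	obj := &Object{Name: name}
	if withFinalizer {
		obj.Finalizers = []string{exampleFinalizer}
	}
	s.objects[name] = obj
	return obj
}

// Delete removes an object immediately when it has no finalizers; otherwise it
// only marks the object as deleting and leaves it in the store.
func (s *Store) Delete(name string) {
	obj, ok := s.objects[name]
	if !ok {
		return
	}
	if len(obj.Finalizers) > 0 {
		obj.Deleting = true
		return
	}
	delete(s.objects, name)
}

// Finalize is what a controller would do after recording the terminal state:
// clear the finalizers so the pending deletion can complete.
func (s *Store) Finalize(name string) {
	obj, ok := s.objects[name]
	if !ok || !obj.Deleting {
		return
	}
	obj.Finalizers = nil
	delete(s.objects, name)
}

// Exists reports whether the object is still visible to a controller.
func (s *Store) Exists(name string) bool {
	_, ok := s.objects[name]
	return ok
}

func main() {
	s := NewStore()

	// Deleted before any finalizer was set: the object vanishes at once.
	s.Create("early-delete", false)
	s.Delete("early-delete")
	fmt.Println("early-delete still exists:", s.Exists("early-delete"))

	// Finalizer set at creation: the object survives the delete until finalized.
	s.Create("guarded", true)
	s.Delete("guarded")
	fmt.Println("guarded still exists after delete:", s.Exists("guarded"))
	s.Finalize("guarded")
	fmt.Println("guarded still exists after finalize:", s.Exists("guarded"))
}
```

In the first case the controller never sees the object again, so it has no chance to record a terminal state; in the second it gets that chance before the object disappears.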

RRap0so (Contributor) commented Oct 1, 2024

Closing in favour of #5788

RRap0so closed this on Oct 1, 2024
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Workflow status is stuck at Aborting
3 participants