Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to clear past errors from workflow state #6

Draft
wants to merge 29 commits into
base: master
Choose a base branch
from

Conversation

Tom-Newton
Copy link
Owner

@Tom-Newton Tom-Newton commented Dec 19, 2023

Tracking issue

flyteorg#4569

Why are the changes needed?

Reduce un-needed information stored in etcd when using failure_policy=WorkflowFailurePolicy.FAIL_AFTER_EXECUTABLE_NODES_COMPLETE. This allows flyte to scale to larger workflows before hitting etcd size limits.

What changes were proposed in this pull request?

  • When using FAIL_AFTER_EXECUTABLE_NODES_COMPLETE delete the previous error from the workflow state every time a new error occurs.
  • This feature is behind a feature flag clear-previous-error which defaults to false (maintaining the original behaviour).
  • Update the TestWorkflowExecutor_HandleFlyteWorkflow_Failing test so that more than one node fails.
  • Add sub test cases within TestWorkflowExecutor_HandleFlyteWorkflow_Failing for all the combinations of OnFailurePolicy and clear-previous-error.

How was this patch tested?

Updated unittests

Setup process

Screenshots

Check all the applicable boxes

Related PRs

This is part 2
Part 1 was implemented by flyteorg#4596

Docs link

@Tom-Newton Tom-Newton force-pushed the tomnewton/remove_error_messages_from_crd branch from 62f6154 to 2b76a91 Compare December 21, 2023 19:32
@Tom-Newton Tom-Newton force-pushed the tomnewton/collapse_sub_nodes_on_failures branch from 05208ec to 64d7fa2 Compare January 12, 2024 18:12
Signed-off-by: Thomas Newton <[email protected]>
This reverts commit dab428d.

Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>
@Tom-Newton Tom-Newton force-pushed the tomnewton/remove_error_messages_from_crd branch from 2b76a91 to 216e1a3 Compare January 16, 2024 20:42
@Tom-Newton Tom-Newton changed the base branch from tomnewton/collapse_sub_nodes_on_failures to master January 16, 2024 20:42
@Tom-Newton Tom-Newton force-pushed the tomnewton/remove_error_messages_from_crd branch from 852f527 to 6c5650c Compare January 16, 2024 20:56
Signed-off-by: Thomas Newton <[email protected]>
@Tom-Newton Tom-Newton force-pushed the tomnewton/remove_error_messages_from_crd branch from f7c5f80 to 2cea075 Compare January 17, 2024 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant