[Core feature] Renaming workflow #4376

GeeCastro · 2023-11-07T14:31:15Z

Motivation: Why do you think this is important?

We're onboarding Flyte for the following use case:

Implementation of our ML training pipeline
With the same train/evaluation pipeline
But different dataset (json file) and different pre-processing steps (also defined in a json file)
Easily differentiate and retrain 100s of ML models when needed

From my perspective, it would make sense to build one workflow shared by all models, with each experiment being an execution of that workflow. But browsing through the executions (in the UI, and potentially programmatically in the future) doesn't seem fit for purpose, because we can't filter on execution name or any other useful attribute.

By defining the workflow in a shared lib and implementing a workflow for each individual model as follows, we can have 1 workflow per model:

# my_ml_model.py

from my_lib import train_wf # <- Shared python function following the workflow DSL
from flytekit import workflow

@workflow
def wf(dataset: dict, processing_config: dict) -> dict:
    return train_wf(dataset=dataset, processing_config=processing_config)

But the resulting workflow name is not very practical, because it is derived from the folder structure. Being able to rename the workflow would make our lives much easier.

Goal: What should the final outcome look like, ideally?

A simple way to rename a workflow, with a property or similar. I believe tasks can be renamed easily using .with_overrides(name="my_name") in the workflow DSL. It'd be great to have something similar for workflows.

It would allow us to browse models from the search bar and see the list of experiments by date, without compromising on the project folder structure.

We could also integrate an ML experiment tracking tool pointing to the Flyte executions.

Describe alternatives you've considered

Being very new to Flyte, I went back to the docs and read the launch plan concepts once again: https://docs.flyte.org/en/latest/concepts/launchplans.html#what-do-launch-plans-provide

When I first read it, the key features seemed to be fixed inputs and scheduling. But perhaps launch plans could solve the use case above by:

Defining a shared workflow (training_pipeline.wf)
Defining a launch plan for each model, calling the workflow with relevant "default" inputs (dataset and processing config)
Creating a new version of the launch plan for each experiment (this sounds a bit heavy)

Propose: Link/Inline OR Additional context

Slack conversation https://flyte-org.slack.com/archives/CP2HDHKE1/p1698914388559349

Are you sure this issue hasn't been raised already?

Yes

Have you read the Code of Conduct?

Yes

welcome · 2023-11-07T14:31:17Z

Thank you for opening your first issue here! 🛠

GeeCastro added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Nov 7, 2023

eapolinario removed the untriaged This issues has not yet been looked at by the Maintainers label Nov 9, 2023

eapolinario assigned zeryx Nov 9, 2023

zeryx mentioned this issue Nov 10, 2023

[Core feature] Nick Names for Flyte Constructs #4399

Open

2 tasks
