[Core feature] Renaming workflow #4376

Open
2 tasks done
GeeCastro opened this issue Nov 7, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@GeeCastro

Motivation: Why do you think this is important?

We're adopting Flyte for the following use case:

  • Implement our ML training pipeline
  • Use the same train/evaluation pipeline for every model
  • Vary only the dataset (a JSON file) and the pre-processing steps (also defined in a JSON file)
  • Easily differentiate and retrain 100s of ML models when needed

From my perspective, it would make sense to build a single workflow shared by all models, with each experiment being an execution of that workflow. However, browsing the executions (in the UI, and potentially programmatically in the future) doesn't seem fit for purpose, because we can't filter on execution name or anything else useful.

By defining the workflow in a shared lib and implementing a workflow for each individual model as follows, we can have 1 workflow per model:

# my_ml_model.py

from my_lib import train_wf # <- Shared python function following the workflow DSL
from flytekit import workflow

@workflow
def wf(dataset: dict, processing_config: dict) -> dict:
    return train_wf(dataset=dataset, processing_config=processing_config)

However, the resulting workflow name is not very practical, because it is derived from the folder structure. Being able to rename the workflow would make our lives much easier.
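To make the naming problem concrete, here is a minimal stdlib-only sketch. It assumes the default workflow name is the module's dotted import path joined with the function name, which is how I understand flytekit to behave. `default_workflow_name`, `make_wf`, and the module paths are hypothetical and only serve the illustration; none of them is a flytekit API.

```python
# Stdlib-only illustration; no flytekit required.
# Assumption: the registered name is "<dotted module path>.<function name>".


def default_workflow_name(fn) -> str:
    """Hypothetical helper mimicking a name derived from the module path."""
    return f"{fn.__module__}.{fn.__name__}"


def make_wf(module_path: str):
    """Build a stand-in `wf` function as if it were defined in `module_path`."""

    def wf(dataset: dict, processing_config: dict) -> dict:
        return {"dataset": dataset, "processing_config": processing_config}

    # Pretend the function lives in a file at this dotted path.
    wf.__module__ = module_path
    return wf


# Two models living in different folders yield two path-shaped names.
wf_a = make_wf("models.vision.my_ml_model")
wf_b = make_wf("models.nlp.other_model")
names = [default_workflow_name(wf_a), default_workflow_name(wf_b)]
print(names)
```

Under that assumption, moving `my_ml_model.py` to another folder changes the workflow name, which is why a rename option decoupled from the project layout would help.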

Goal: What should the final outcome look like, ideally?

A simple way to rename a workflow, via a property or similar. I believe tasks can be renamed easily using .with_overrides(name="my_name") in the workflow DSL. It'd be great to have something similar for workflows.

It would allow us to browse models from the search bar and see the list of experiments by date, without compromising on the project folder structure.

We could also integrate an ML experiment tracking tool pointing to the flyte executions.

Describe alternatives you've considered

Being very new to Flyte, I went back to the docs and reread the launch plan concepts (https://docs.flyte.org/en/latest/concepts/launchplans.html#what-do-launch-plans-provide).

When I first read it, the key features seemed to be fixed inputs and scheduling. But perhaps launch plans could also solve the use case above by:

  1. Defining a shared workflow (training_pipeline.wf)
  2. Defining a launch plan for each model, calling the workflow with the relevant "default" inputs (dataset and processing config)
  3. Creating a new version of the launch plan for each experiment (this sounds a bit heavy)
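The three steps above can be sketched in plain Python as follows. This is only a model of the idea, not flytekit code: `NamedLaunchPlan`, the stand-in `train_wf`, the model names, and the file names are all hypothetical. In flytekit itself, `LaunchPlan.get_or_create(workflow=..., name=..., default_inputs=...)` appears to play the role of step 2.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


def train_wf(dataset: dict, processing_config: dict) -> dict:
    """Stand-in for the shared training workflow (step 1)."""
    return {"trained_on": dataset["path"], "steps": processing_config["steps"]}


@dataclass(frozen=True)
class NamedLaunchPlan:
    """Hypothetical model of a launch plan: a name plus default inputs."""

    name: str
    workflow: Callable[..., dict]
    default_inputs: Dict[str, dict] = field(default_factory=dict)

    def execute(self, **overrides: dict) -> dict:
        # Per-execution overrides win over the stored defaults.
        inputs = {**self.default_inputs, **overrides}
        return self.workflow(**inputs)


# Step 2: one named launch plan per model, all sharing train_wf.
MODELS = {
    "model_a": {
        "dataset": {"path": "a.json"},
        "processing_config": {"steps": ["normalize"]},
    },
    "model_b": {
        "dataset": {"path": "b.json"},
        "processing_config": {"steps": ["tokenize"]},
    },
}
launch_plans = {
    name: NamedLaunchPlan(name=name, workflow=train_wf, default_inputs=inputs)
    for name, inputs in MODELS.items()
}

# Each experiment is then an execution of the model's launch plan.
result = launch_plans["model_a"].execute()
print(result)
```

In this model, each model gets its own searchable name while the workflow stays shared. If defaults can be overridden per execution, as they are here, step 3 might only be needed when the defaults themselves change.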

Propose: Link/Inline OR Additional context

Slack conversation https://flyte-org.slack.com/archives/CP2HDHKE1/p1698914388559349

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes

GeeCastro added the enhancement (New feature or request) and untriaged (This issue has not yet been looked at by the Maintainers) labels on Nov 7, 2023

welcome bot commented Nov 7, 2023

Thank you for opening your first issue here! 🛠

eapolinario removed the untriaged label on Nov 9, 2023