[Core feature] Allow passing additional metadata in inputs/outputs of tasks/workflows #4654
Open
5 of 6 tasks
Labels
enhancement
Motivation: Why do you think this is important?
There are a few instances we faced in the past where we wanted to pass additional contextual information about the inputs/outputs of a task that is not itself an input/output (the most recent example is Flyte Decks). Unfortunately, the only way we could pass that information was by convention: the engine (i.e. Propeller) assumes a task writes decks, checks for the file's existence, and bails if it's not there. Conventions are not very strong :-) and limit the optimizations the system can make.
Another instance of this is executing, generating, and exporting an input validation report from Great Expectations or Pandera. This is not really an output but rather metadata about the input/output dataframes.
Goal: What should the final outcome look like, ideally?
This is a complex change and its complexity stems from a few aspects:
Here is a proposed breakdown of the work particularly around the component boundaries:
- `flyteidl` and all backend code. This code should default to reading the new `InputData` and `OutputData` and fall back to `LiteralMap` if unmarshaling fails. This ensures an old `propeller` can continue to work with a newer `admin` (but not vice versa).
- When `flytepropeller` writes `inputs.pb` for a task, it should gate that behavior based on the version of flytekit. If it's not a `flytekit` runtime or it's an old version, it should default to writing the old format until other SDKs are updated.
- `flytekit` to do the same.
- `flyteconsole` to do the same.

The rollout plan is:
As long as flyteadmin is updated first, the system assumptions will hold.
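
To make the compatibility rules in the breakdown concrete, here is a minimal sketch of the two behaviors: readers try the new format first and fall back to the old one, and writers only emit the new format for a recent enough `flytekit`. Everything in it is an assumption for illustration: JSON stands in for protobuf serialization, the `kind` marker, the `metadata` field, and the helper names are hypothetical, and `MIN_FLYTEKIT_VERSION` is a made-up threshold. It also assumes the decoder can reliably tell the two formats apart.

```python
import json
from dataclasses import dataclass, field

# Hypothetical stand-ins for the IDL messages. JSON replaces protobuf here
# only to keep the sketch self-contained.
@dataclass
class LiteralMap:
    literals: dict

@dataclass
class InputData:
    inputs: LiteralMap
    metadata: dict = field(default_factory=dict)  # hypothetical field

# Made-up threshold: first flytekit version assumed to read the new format.
MIN_FLYTEKIT_VERSION = (1, 11, 0)


def decode_input_data(raw: bytes) -> InputData:
    """Decode the new envelope; raise ValueError if raw is not in that format."""
    doc = json.loads(raw)
    if not isinstance(doc, dict) or doc.get("kind") != "InputData":
        raise ValueError("not an InputData envelope")
    return InputData(LiteralMap(doc.get("inputs", {})), doc.get("metadata", {}))


def read_inputs(raw: bytes) -> InputData:
    """Reader side: prefer the new format, fall back to the legacy LiteralMap."""
    try:
        return decode_input_data(raw)
    except ValueError:
        # Old writers produce a bare literal map with no metadata.
        return InputData(inputs=LiteralMap(json.loads(raw)))


def use_new_format(runtime: str, version: str) -> bool:
    """Writer side: emit the new format only for a recent enough flytekit."""
    if runtime != "flytekit":
        return False  # other SDKs keep receiving the old format
    try:
        parts = [int(p) for p in version.split(".")[:3]]
    except ValueError:
        return False  # unparseable version: stay on the old format
    parts += [0] * (3 - len(parts))
    return tuple(parts) >= MIN_FLYTEKIT_VERSION
```

With this shape, a payload written by an old component still decodes (with empty metadata), while an old reader given the new envelope would not understand it, which is why the writer side is gated on the consumer's version.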
Describe alternatives you've considered
Continue to use conventions to pass data... not good
Propose: Link/Inline OR Additional context
No response
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?