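Description
I tried grouping the artifacts by introducing "namespaces" as the first level of config in YAML while moving the actual artifacts to the second level, and was planning to address the artifacts as a_group_of_artifacts:outputs, a_group_of_artifacts:errors etc.
But it turns out that Kedro does not support this: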
DatasetError: An exception occurred when parsing config for dataset 'a_group_of_artifacts':
'type' is missing from dataset catalog configuration
Context
Our pipelines mostly augment the initial inputs, which means we end up with a lot of similarly named artifacts (e.g. final_outputs, processed_outputs and other kinds of _outputs) which gets confusing. It feels that there should be a better way to group/namespace the artifacts.
Possible Implementation
Instead of treating the 1st-level YAML blocks as artifacts, why not traverse the levels recursively until a block with a type key is encountered, then treat that block as an artifact while ignoring the enclosing blocks?
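To make the proposal concrete, here is a minimal sketch of that recursive traversal in plain Python. It is not Kedro code: flatten_catalog, the nested example dict, the file paths and the ":" separator are all hypothetical, and whether Kedro would accept such names is not something this sketch claims.

```python
def flatten_catalog(config, prefix="", sep=":"):
    """Collect every nested block that has a "type" key as one flat entry.

    Hypothetical helper, not part of Kedro. Blocks without "type" are
    treated purely as grouping levels and contribute only to the name.
    """
    flat = {}
    for name, block in config.items():
        if not isinstance(block, dict):
            continue  # ignore scalar values found at a grouping level
        full_name = f"{prefix}{sep}{name}" if prefix else name
        if "type" in block:
            flat[full_name] = block  # reached an actual dataset definition
        else:
            flat.update(flatten_catalog(block, full_name, sep))
    return flat


# Hypothetical nested catalog, written as the dict a YAML loader would return.
nested = {
    "a_group_of_artifacts": {
        "outputs": {"type": "pandas.CSVDataset", "filepath": "data/outputs.csv"},
        "errors": {"type": "pandas.CSVDataset", "filepath": "data/errors.csv"},
    },
    "standalone": {"type": "pandas.CSVDataset", "filepath": "data/standalone.csv"},
}

flat = flatten_catalog(nested)
print(sorted(flat))
```

The flattened dict has one top-level entry per dataset, each with its own type, which is the shape the error above says Kedro expects.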
Possible Alternatives
Maybe some other solution I don't know about? Not a Kedro expert...
Hey @namedgraph, thank you for your feature proposal. Your idea makes sense, but as of now, Kedro does not support grouping artifacts in the manner you describe, and interprets each entry in the catalog as a separate data source with its own type definition.
For now, you can try to use Kedro dataset factories to reduce the number of similar catalog entries on your project.
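As a rough illustration of what a dataset factory does, the sketch below writes one pattern entry as a Python dict (mirroring what would go in the YAML catalog) and expands it for a concrete dataset name. The pattern name, the file paths and the resolve helper are assumptions made for this example; Kedro performs its own pattern matching, so treat this only as an approximation of the idea.

```python
import re

# One pattern entry standing in for many "*_outputs" entries.
# Names and paths are hypothetical.
factory_config = {
    "{stage}_outputs": {
        "type": "pandas.CSVDataset",
        "filepath": "data/{stage}_outputs.csv",
    }
}


def resolve(dataset_name, config):
    """Illustrative only: expand the first pattern that matches dataset_name."""
    for pattern, entry in config.items():
        # Turn "{stage}_outputs" into the regex "(?P<stage>.+)_outputs".
        regex = re.escape(pattern).replace(r"\{", "(?P<").replace(r"\}", ">.+)")
        match = re.fullmatch(regex, dataset_name)
        if match:
            # Fill the captured placeholders into every value of the entry.
            return {key: value.format(**match.groupdict()) for key, value in entry.items()}
    return None  # no pattern matched


print(resolve("final_outputs", factory_config))
```

With a pattern like this, final_outputs and processed_outputs would share a single catalog entry instead of needing one each.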