Allow new attributes to be added to DataSets #400
Comments
I've logged this for us to discuss. Thanks! I vaguely remember seeing a similar feature request around. If anybody remembers it, can you please link it? :)
Thanks @mzjp2, glad to hear it's in the discussion. I am definitely open to any better suggestions. Right now my solution is custom datasets for everything, and it's a bit overkill for this one application.
Potentially related to #163?
I think the use cases are different. I think in my use case I want to be able to add attributes to datasets that become part of the dataset in a way that plugins can interact with them. The attribute is specific to the dataset, not generic to the whole project. I can definitely see use cases combining the If I were to describe the most general use case. I want to be able to create a plugin that does something, and I want to be able to have attributes on the dataset that might tell the dataset something as simple as to skip. After thinking about the generic use case a similar argument could be useful on the |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Still dreaming of being able to add additional attributes to datasets so that I can access them in hooks. Is this something the Kedro team is interested in allowing?
Hi @WaylonWalker! This is something we'd like to solve and we have an internal issue to address this workflow. However, it's not a top priority at the moment so I can't give any estimate about when it would be finished.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Closing this in favour of: #1076
Description
I have certain attributes to track within my datasets and have created custom DataSets to get around this issue. Now that hooks are out, most of my reasons for custom DataSets are gone, and I can achieve the same thing with an after_node_run hook, but I still cannot attach custom attributes to datasets.
Use Case 1 (can I share this dataset)
I would like to attach things like confidentiality to the dataset so that team members can easily know who they can share a dataset with by looking at an attribute on the dataset. Ideally, I would like to add these to the catalog.
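As a rough sketch of how such an attribute might be consumed, the snippet below filters datasets by a confidentiality level. Everything here is an assumption for illustration: the `confidentiality` key, the three level names, the `can_share` helper, and the plain dict standing in for the catalog are all hypothetical and not part of Kedro.

```python
from typing import Any, Dict

# Hypothetical confidentiality levels, ordered from least to most restrictive.
LEVELS = ["public", "internal", "restricted"]


def can_share(attributes: Dict[str, Any], audience_clearance: str) -> bool:
    """Return True if the audience's clearance covers the dataset's level."""
    # Datasets without the attribute are treated as most restrictive, to be safe.
    level = attributes.get("confidentiality", LEVELS[-1])
    return LEVELS.index(audience_clearance) >= LEVELS.index(level)


# A plain dict standing in for per-dataset attributes declared in the catalog.
catalog_attributes = {
    "raw_cars": {"confidentiality": "restricted"},
    "car_summary": {"confidentiality": "public"},
}

# Which datasets could be shared with someone holding "internal" clearance?
shareable = [
    name for name, attrs in catalog_attributes.items() if can_share(attrs, "internal")
]
print(shareable)
```

Defaulting a missing attribute to the most restrictive level is a design choice in this sketch, so that an unlabeled dataset is never shared by accident.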
Use Case 2 (can I delete this sub_pipeline)
I would also like to be able to check the pipeline health in CI. One thing I would like to look for is dangling edges that are useless. Sometimes during refactoring we switch to a new section of the pipeline, the old one gets disconnected but is never removed, and later we wonder whether anyone is using that output. It would be nice to have CI tell us that we need to either mark that dataset as a final output or remove that section of the pipeline.
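A CI check along these lines could be sketched as follows. This is only an illustration under stated assumptions: nodes are modelled as plain `(inputs, outputs)` tuples rather than real Kedro nodes, and the `final_output` attribute name and the `dangling_outputs` helper are hypothetical.

```python
from typing import Any, Dict, List, Set, Tuple

# Each node is (inputs, outputs); a plain stand-in for a pipeline node.
Node = Tuple[List[str], List[str]]


def dangling_outputs(
    nodes: List[Node], attributes: Dict[str, Dict[str, Any]]
) -> Set[str]:
    """Return outputs that no node consumes and that are not marked final."""
    consumed = {name for inputs, _ in nodes for name in inputs}
    produced = {name for _, outputs in nodes for name in outputs}
    return {
        name
        for name in produced - consumed
        # "final_output" is a hypothetical per-dataset attribute.
        if not attributes.get(name, {}).get("final_output", False)
    }


nodes = [
    (["raw"], ["clean"]),
    (["clean"], ["report"]),
    (["clean"], ["old_features"]),  # left over from a refactoring
]
attributes = {"report": {"final_output": True}}

dangling = dangling_outputs(nodes, attributes)
print(dangling)
```

In CI, a non-empty result could fail the build, prompting someone to either mark the dataset as a final output or delete the unused section.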
Possible Implementation
The AbstractDataset class would need to accept an attributes keyword and then attach the attributes to each instance.
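A minimal sketch of that idea, assuming a simplified stand-in class rather than Kedro's real base class: `AttributedDataset`, `from_config`, the `filepath` argument, and the dict layout of the catalog entry are all hypothetical names chosen for illustration.

```python
from typing import Any, Dict, Optional


class AttributedDataset:
    """Hypothetical stand-in for a dataset base class; not Kedro's actual API."""

    def __init__(self, filepath: str, attributes: Optional[Dict[str, Any]] = None):
        self.filepath = filepath
        # Copy so that later changes to the caller's dict do not leak in.
        self.attributes: Dict[str, Any] = dict(attributes or {})


def from_config(config: Dict[str, Any]) -> AttributedDataset:
    """Build a dataset from a catalog-style dict entry (assumed layout)."""
    return AttributedDataset(
        filepath=config["filepath"],
        attributes=config.get("attributes"),
    )


# A catalog entry carrying free-form attributes alongside the usual settings.
entry = {"filepath": "data/01_raw/cars.csv", "attributes": {"confidential": True}}
cars = from_config(entry)
print(cars.attributes)
```

Because the attributes live on the instance, a hook or plugin that receives the dataset could read `cars.attributes` without the dataset class needing to know what the keys mean.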