Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop attributes_exclude and keep attributes_include #2154

Closed
tcompa opened this issue Dec 19, 2024 · 4 comments
Closed

Drop attributes_exclude and keep attributes_include #2154

tcompa opened this issue Dec 19, 2024 · 4 comments
Labels
flexibility Support more workflow-execution use cases

Comments

@tcompa
Copy link
Collaborator

tcompa commented Dec 19, 2024

As of fractal-analytics-platform/fractal-web#678 (comment), we are not anymore defining attribute filters for workflow tasks. This introduces a strong simplification, because attribute filters are only ever defined in relation to a dataset, and then the list of possible values is already known - based on the image list.

This prompts us (with @mfranzon) to propose a further simplification, that is, to drop attributes_exclude. In the UI the presence of both include/exclude is made redundant by a button like "select all", as in:

  1. To include a few image attributes, I start from an empty selection and I pick them one by one.
  2. To include a few image attributes, I start from an select-all selection and I unpick them one by one.

The advantages of this simplification are many:

  1. The merging of different filters (e.g. the ones at the dataset and job level) is easier and less error-prone - see Replace attribute filters rather than merging them #2155.
  2. The database models and API schemas become simpler, with less custom validation for schemas and fewer fields for models.
  3. The actual filtering operation (e.g. within the runner) is simpler.

In terms of downside, we can only think about a single edge case, namely the one where new images are added to the dataset after some processing already took place. Here is how it would look like with/without the exclude filter:

With exclude filter:

  1. I run conversion for wells B01 and B02
  2. I run processing excluding B02, because it has some issues. This I can achieve e.g. by setting the dataset.filter to exclude B02. Or I can set the job.filters to exclude B02.
  3. I run conversion for well B03, and this image is added to the dataset
  4. When I run some processing, and B02 is automatically excluded (if I did set it in the dataset filters). Or, if I had set it in the job.filters, I know that I could re-use the same filter again, with no change - ref To discuss: expose "re-use last-job filter" option fractal-web#683.

Without exclude filter:

  1. I run conversion for wells B01 and B02
  2. I run processing for well B01 only, because B02 has some issues. This I can achieve by setting either the dataset.filters or the job.filters to only include B01.
  3. I run conversion for well B03, and this image is added to the dataset.
  4. When I run some processing, I need to either update the dataset filters (so that they also include B03) or to modify the job filters I used in the last job, because I now have to specify that also B03 is included.

We deem this edge case an acceptable trade-off. The user is going back to a "conversion" run, and then they are likely aware that something will have to change in the processing as well. This issue does not appear at all e.g. if all conversion is done in advance, before any processing.

@tcompa
Copy link
Collaborator Author

tcompa commented Dec 19, 2024

Another minor downside of this idea:

If I have a very large number N of possible values for a given attribute, the resulting JSON when I export an object with filters has:

  • A single entry like "exclude=B05", if we support "exclude"
  • A large object like "include=B01,B02,B03,..." (with N-1 elements) if we only support "include"

Depending on typical N values and on how frequent this scenario is, it may make some histories less readable.


Note that handling a large number N of options would clearly not work in the current plan for the UI, as we would never show a dropdown menu with N=1000 options.

@jluethi
Copy link
Collaborator

jluethi commented Dec 19, 2024

Will need to read more carefully. Quick comment on typical Ns:
Plates will have single digit Ns
Wells will go from 10s to 1536 for plate types we know
Acquisitions will typically go from 1 to 50 or so

@jluethi
Copy link
Collaborator

jluethi commented Dec 19, 2024

I'd be fine with not adding exclusion filters for now

@tcompa tcompa changed the title To discuss: Drop attributes_exclude and keep attributes_include Drop attributes_exclude and keep attributes_include Jan 7, 2025
@tcompa
Copy link
Collaborator Author

tcompa commented Jan 7, 2025

This issue was a design discussion and it requires no further action. Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flexibility Support more workflow-execution use cases
Projects
Development

No branches or pull requests

2 participants