List categorization questions for agribalyse #902

Closed
alexgarel opened this issue Sep 19, 2022 · 7 comments
Labels
✨ enhancement New feature or request

Comments

@alexgarel
Member

alexgarel commented Sep 19, 2022

Since:

we now have an API to list categorization opportunities by question type.

But our goal is

To do that, we would like to have the same request but only for agribalyse categories

A flexible way to do that is to add a parameter with_property=agribalyse_food_code

That will:

  1. search for all values in the taxonomy that have this property (here: agribalyse_food_code)
  2. in the request, filter on value_tag using the "IN" operator
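
The two steps above can be sketched roughly as follows. This is only an illustration, not Robotoff's actual code: the in-memory `TAXONOMY` dict, its made-up tags and codes, and the helper names `tags_with_property` and `value_tag_in_clause` are all assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory taxonomy: category tag -> properties.
# The tags and codes are made up for illustration.
TAXONOMY: Dict[str, Dict[str, str]] = {
    "en:example-a": {"agribalyse_food_code": "00001"},
    "en:example-b": {},
    "en:example-c": {"agribalyse_food_code": "00002"},
}


def tags_with_property(taxonomy: Dict[str, Dict[str, str]], prop: str) -> List[str]:
    """Step 1: collect every tag whose entry defines the given property."""
    return sorted(tag for tag, props in taxonomy.items() if prop in props)


def value_tag_in_clause(tags: List[str]) -> Tuple[str, List[str]]:
    """Step 2: build a parameterized "value_tag IN (...)" filter."""
    if not tags:
        # "IN ()" is not valid SQL, so fall back to a filter that matches nothing.
        return "1 = 0", []
    placeholders = ", ".join("?" for _ in tags)
    return f"value_tag IN ({placeholders})", list(tags)


tags = tags_with_property(TAXONOMY, "agribalyse_food_code")
clause, params = value_tag_in_clause(tags)
```

With the sample data, `clause` holds two placeholders and `params` holds the two tags that carry the property.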

Implementation details to be discussed

@alexgarel
Member Author

@raphael0202 maybe it's the wrong way of thinking about it.

I see more than one way to do it:

  1. either we keep it simple, by:
    • finding all matching value_tag values through the Taxonomy object (and caching the result!)
    • passing them as the values of a "value_tag IN" filter in the query
  2. or we avoid submitting very long queries (2400 agribalyse entries), by:
    • finding all matching value_tag values through the Taxonomy object
    • building a table that lists those entries on the fly, with a specific name tags_<taxonomy_name>_<property_name>
    • using it as a subquery in a "value_tag IN" filter

In case 2 refresh might be slightly more difficult to handle (but it's a remove / recreate).
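
To make option 2 more concrete, here is a minimal sketch of the helper table, the subquery, and the remove / recreate refresh. SQLite is used only so the example runs on its own; the `insight` table layout, the sample rows, and the function names are assumptions, not Robotoff's schema.

```python
import sqlite3
from typing import Iterable, List, Tuple


def refresh_tag_table(conn: sqlite3.Connection, table: str, tags: Iterable[str]) -> None:
    """Drop and recreate the helper table (the "remove / recreate" refresh)."""
    # The table name is interpolated, so it must come from trusted code, never from user input.
    conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    conn.execute(f'CREATE TABLE "{table}" (value_tag TEXT PRIMARY KEY)')
    conn.executemany(
        f'INSERT INTO "{table}" (value_tag) VALUES (?)',
        [(tag,) for tag in sorted(set(tags))],
    )
    conn.commit()


def insights_in_table(conn: sqlite3.Connection, table: str) -> List[Tuple[str, str]]:
    """Select insights whose value_tag appears in the helper table."""
    query = (
        "SELECT barcode, value_tag FROM insight "
        f'WHERE value_tag IN (SELECT value_tag FROM "{table}") '
        "ORDER BY barcode"
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE insight (barcode TEXT, value_tag TEXT)")
conn.executemany(
    "INSERT INTO insight VALUES (?, ?)",
    [("001", "en:example-a"), ("002", "en:example-b"), ("003", "en:example-c")],
)

# Hypothetical name following the tags_<taxonomy_name>_<property_name> pattern.
table = "tags_categories_agribalyse_food_code"
refresh_tag_table(conn, table, ["en:example-a", "en:example-c"])
rows = insights_in_table(conn, table)
```

The query text stays short regardless of how many tags the helper table holds, which is the point of this option.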

I don't really know if there will be a big difference and if it's worth going for option "2".

Or maybe someone has a different idea? (@raphael0202, @alexfauquette)

@alexgarel
Member Author

A third possibility: keep a handmade list of categories in a simple file and load it into a table, as in solution 2.

@alexfauquette
Member

Not super familiar with the robotoff code base, but would it be possible to add two fields:

  • help_nutriscore
  • help_ecoscore

that would be false by default, and true if the prediction helps to compute the corresponding score?

@raphael0202
Collaborator

During insight creation we may also set a flag in data->is_agribalyse_category, I think it would be the simplest way of doing it.
But we should first discuss the relevance of categorizing agribalyse-specific categories in priority, I'm not convinced yet it's the best approach.

@raphael0202
Collaborator

Another very flexible way to implement this would be to add a tag (or campaign) field to product_insight: a list of strings (JSONB) that can be used to create "annotation campaigns" and to select only a subset of questions during question retrieval.
All category predictions that are agribalyse categories could be flagged with the tag agribalyse.
We would check if the category is an agribalyse category during insight import.

And we would display only insights from this campaign with:
https://hunger.openfoodfacts.org/questions?type=category&campaign=agribalyse

This method is flexible enough for other campaigns where we want to focus on a subset of products.
For short campaigns, a simple DB update would be enough to add the tag to specific products.
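
As a rough illustration of the campaign idea, the sketch below tags category insights at import time and then filters questions by campaign, mirroring the `type` and `campaign` query parameters of the URL above. The `Insight` dataclass, the `AGRIBALYSE_CATEGORIES` set, and both function names are assumptions made for this example, not the implementation that was merged.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Set

# Hypothetical set of categories that carry an agribalyse code.
AGRIBALYSE_CATEGORIES: Set[str] = {"en:example-a"}


@dataclass
class Insight:
    barcode: str
    type: str
    value_tag: str
    # Stand-in for the JSONB list of campaign tags on product_insight.
    campaign: List[str] = field(default_factory=list)


def tag_on_import(insight: Insight, agribalyse_categories: Set[str]) -> Insight:
    """Flag category insights whose value is an agribalyse category."""
    if (
        insight.type == "category"
        and insight.value_tag in agribalyse_categories
        and "agribalyse" not in insight.campaign
    ):
        insight.campaign.append("agribalyse")
    return insight


def select_questions(
    insights: Iterable[Insight],
    insight_type: Optional[str] = None,
    campaign: Optional[str] = None,
) -> List[Insight]:
    """Keep only insights matching the requested type and campaign, if given."""
    return [
        i
        for i in insights
        if (insight_type is None or i.type == insight_type)
        and (campaign is None or campaign in i.campaign)
    ]


imported = [
    tag_on_import(Insight("001", "category", "en:example-a"), AGRIBALYSE_CATEGORIES),
    tag_on_import(Insight("002", "category", "en:example-b"), AGRIBALYSE_CATEGORIES),
    tag_on_import(Insight("003", "label", "en:example-a"), AGRIBALYSE_CATEGORIES),
]
selected = select_questions(imported, insight_type="category", campaign="agribalyse")
```

Because the campaign is just a string in a list, another campaign can reuse the same filter without any schema change.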

@alexgarel
Member Author

@raphael0202 later on, we could even add an API to add a list of codes to a campaign :-) and maybe integrate it with mass edit or something like that. So that any advanced user could create a campaign.

@raphael0202
Collaborator

Implemented, closing the issue.

Repository owner moved this from Todo to Done in 🤖 Artificial Intelligence @ Open Food Facts Oct 12, 2022