Create a quality check for number of ingredients #9732
Labels: 🧽 Data quality (https://wiki.openfoodfacts.org/Quality)
Comments
Related code: lib/ProductOpener/DataQualityFood.pm. In the categories taxonomy, add a "minimum_number_of_ingredients:en: 3" property to the mozzarella entry and run "make build_taxonomies". Then add a check in lib/ProductOpener/DataQualityFood.pm (see the example for the related "en:ingredients-single-ingredient-from-category-missing" warning).
I am currently working on this issue.
github-project-automation (bot) moved this from Needs review to Done in 🧽 Ensuring Data Quality on Jan 2, 2025
github-project-automation (bot) moved this from To discuss and validate to Done in 🍊 Open Food Facts Server issues on Jan 2, 2025
stephanegigandet pushed a commit that referenced this issue on Jan 6, 2025
🤖 I have created a release *beep* *boop*

---

## [2.53.0](v2.52.0...v2.53.0) (2025-01-06)

### Features

* data-quality - minimum number of ingredients ([#11152](#11152)) ([d7881d4](d7881d4)), closes [#9732](#9732)
* data-quality/apply-remove_insignificant_digits-for-nutriments ([#11147](#11147)) ([a6df72f](a6df72f))
* Top categories for Open Products Facts ([2239473](2239473))
* Top categories for Open Products Facts ([#11171](#11171)) ([2239473](2239473))

### Bug Fixes

* allow serving size to be hyphenated ([#11161](#11161)) ([7c0df2d](7c0df2d))
* Correct indentation, so that CodeQL can work with the code ([#11166](#11166)) ([0178ac2](0178ac2))
* data quality - increase threshold for comparison between fiber and its subnutriments ([#11145](#11145)) ([f0a2682](f0a2682))
* Delete html/images/lang/de/labels/halal.90x90.png ([#11183](#11183)) ([80cf708](80cf708))
* environmental_score ([#11191](#11191)) ([cbe221e](cbe221e))
* fix OPF PR labelling ([e708ae3](e708ae3))
* fix OPF PR labelling ([#11154](#11154)) ([e708ae3](e708ae3))
* fixes for Green-Score ([#11155](#11155)) ([7287d8b](7287d8b))
* green-score link ([#11146](#11146)) ([abf858a](abf858a))
* nutriscore grade from category change for extra virgin olive oils ([#11156](#11156)) ([32d58e0](32d58e0))
* rm nova drilldown field for beauty ([#11193](#11193)) ([3f5b654](3f5b654))
* SonarCloud issues ([#11165](#11165)) ([b84d545](b84d545))
* warnings in import_convert_carrefour_france ([#11189](#11189)) ([4643e3a](4643e3a))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Problem
When cleaning up the Mozzarella category, I noticed that some products list only two ingredients. The most important ingredient (rennet) was missing from the ingredient list. These products can easily be found by plotting the number of ingredients per product. The plot below shows the buffalo mozzarellas:
Proposed solution
Define a minimum number of required ingredients per category in the taxonomy. Use this minimum to check the products in the corresponding category, and raise a data quality flag when a product lists fewer ingredients than required.
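The check described above could look roughly like the sketch below. It is written in Python only to illustrate the logic; the actual check would live in the Perl module lib/ProductOpener/DataQualityFood.pm and read the minimum from the taxonomy. The dictionary standing in for the taxonomy property, the function name, and the warning tag name are all assumptions made for this example, as is the use of the `categories_tags` and `ingredients_n` product fields.

```python
# Hypothetical stand-in for the taxonomy property
# "minimum_number_of_ingredients:en" (the real value would come from the
# categories taxonomy, not from a hard-coded dictionary).
MINIMUM_NUMBER_OF_INGREDIENTS = {"en:mozzarella": 3}

# Hypothetical tag name, used for illustration only.
WARNING_TAG = "en:ingredients-count-lower-than-expected-for-the-category"


def check_minimum_number_of_ingredients(product, minimums=MINIMUM_NUMBER_OF_INGREDIENTS):
    """Return a list of warning tags for a product dictionary.

    Assumes the product has "categories_tags" (list of category ids) and
    "ingredients_n" (number of parsed ingredients).
    """
    ingredients_n = product.get("ingredients_n")
    if ingredients_n is None:
        # No ingredient list was parsed, so there is nothing to compare.
        return []

    # Collect the minimum for every category of the product that defines one.
    required = [
        minimums[category]
        for category in product.get("categories_tags", [])
        if category in minimums
    ]

    # Flag the product if it has fewer ingredients than the strictest minimum.
    if required and ingredients_n < max(required):
        return [WARNING_TAG]
    return []
```

With these assumptions, a mozzarella listing two ingredients gets the warning tag, while one listing three or more, or a product in a category without a defined minimum, does not.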
Additional context
At the moment there are few quality checks on ingredients. This feature could extend the existing check for single-ingredient products, which is essentially a maximum number of ingredients.
Number of products impacted
This check is mainly for products where the producer did not list all the ingredients. Hopefully there are not too many of these.
Time per product
Once these products are tagged, we no longer have to search for them manually.