Percentage estimation of ingredients #10704

Open
laurenskz opened this issue Aug 15, 2024 · 0 comments
Labels
🥗🔍 Ingredients analysis https://wiki.openfoodfacts.org/Ingredients_Extraction_and_Analysis


Problem

The current estimation of ingredient percentages is a good step in the right direction, but there is a lot of room for improvement. For example, suppose the second ingredient of a product is sugar and the product contains 14 g of carbohydrates per 100 g. Since sugar is almost pure carbohydrate, the second ingredient can make up at most about 14% of the product. Every later ingredient is then also capped at about 14%, and in a two-ingredient product the first ingredient must be at least about 86%. This is currently overlooked, and we make arbitrary assumptions about ratios.
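
The bound in this example can be computed directly. The snippet below is a minimal sketch, assuming sugar contains roughly 100 g of carbohydrates per 100 g; the function name is hypothetical and the two-ingredient case is only an illustration of how one bound propagates to other ingredients.

```python
def max_share_from_nutrient(product_per_100g: float, ingredient_per_100g: float) -> float:
    """Upper bound on an ingredient's share (0..1) implied by a single nutrient.

    The ingredient alone cannot contribute more of the nutrient than the
    whole product contains, so share * ingredient_per_100g <= product_per_100g.
    """
    if ingredient_per_100g <= 0:
        return 1.0  # this nutrient tells us nothing about the ingredient
    return min(1.0, product_per_100g / ingredient_per_100g)


# Assumption: sugar is ~100 g carbohydrates per 100 g.
sugar_max = max_share_from_nutrient(14.0, 100.0)

# In a product with only two ingredients, the first one takes up the rest.
first_min_if_two_ingredients = 1.0 - sugar_max

print(f"sugar <= {sugar_max:.0%}, first ingredient >= {first_min_if_two_ingredients:.0%}")
```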

Proposed solution

There is a paper published which uses all available information about mandatory nutrients combined with linear optimization techniques: https://www.sciencedirect.com/science/article/pii/S0889157522001260.

The general idea is as follows. We have n ingredients and a set of known nutrients (mandatory labeling), indexed by k.
For each ingredient_j (1 <= j <= n):
Fetch the nutrition info of ingredient_j (since it is a simple ingredient, there should be an entry in the database). Let nutrient_j_k be the value of nutrient k per 100 g of ingredient j.
Then we declare:
quantity_j as a double in the range (0, 1)
if j > 1, add the constraint quantity_j <= quantity_{j-1}
if the ingredient has a known percentage, add a constraint fixing quantity_j to it
Now to the smart part:

Let total_nutrient_k be the value of nutrient k per 100 g of the finished product. Then for each k that is known we add two constraints:

quantity_1 * nutrient_1_k + quantity_2 * nutrient_2_k + ... + quantity_n * nutrient_n_k > 0.99 * total_nutrient_k
quantity_1 * nutrient_1_k + quantity_2 * nutrient_2_k + ... + quantity_n * nutrient_n_k < 1.01 * total_nutrient_k

Add the following constraints so that the quantities sum to approximately one:

sum_j quantity_j > 0.99
sum_j quantity_j < 1.01

Now we have a system of linear constraints that can easily be solved by an LP solver. The resulting quantities will sum to approximately one (within the 1% tolerance), and the estimate uses all the available information.
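
Written compactly, the constraints above amount to the feasibility problem below. The symbols are shorthand introduced here, not notation from the paper: q_j stands for quantity_j, a_{jk} for nutrient_j_k, b_k for total_nutrient_k, p_j for a declared percentage (as a fraction), and \varepsilon = 0.01 for the tolerance. Non-strict inequalities are used because that is what LP solvers accept.

```latex
\begin{align*}
\text{find}\quad & q_1, \dots, q_n \in [0, 1] \\
\text{s.t.}\quad & q_j \le q_{j-1}, && 2 \le j \le n \\
& q_j = p_j, && \text{for each } j \text{ with a declared percentage} \\
& (1-\varepsilon)\, b_k \le \sum_{j=1}^{n} q_j\, a_{jk} \le (1+\varepsilon)\, b_k, && \text{for each known nutrient } k \\
& 1-\varepsilon \le \sum_{j=1}^{n} q_j \le 1+\varepsilon
\end{align*}
```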

Additional context

The authors note the following: A study with known ingredient compositions shows that estimates are within a 0.9% difference of products’ actual recipes. This would be a huge improvement.

Code pointers

This could very easily be implemented in Python using the OR-Tools library: https://developers.google.com/optimization/lp/lp_example#python_7

Number of products impacted

All products composed of multiple ingredients

Time per product

More accuracy for end user

Projects
Status: To do
Status: To discuss and validate